Overview

Dataset statistics

Number of variables: 34
Number of observations: 19717
Missing cells: 221379
Missing cells (%): 33.0%
Duplicate rows: 157
Duplicate rows (%): 0.8%
Total size in memory: 5.1 MiB
Average record size in memory: 272.0 B
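
The two percentages above follow from the raw counts. A minimal sketch of the arithmetic, assuming "Missing cells (%)" is taken over all cells (variables times observations) and "Duplicate rows (%)" over all rows; the variable names are illustrative only:

```python
# Figures copied from the dataset statistics above.
n_variables = 34
n_observations = 19717
missing_cells = 221379
duplicate_rows = 157

# Assumption: the missing-cell share is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: the duplicate share is computed over all rows.
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
```

Under these assumptions both values round to the figures reported above.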

Variable types

Categorical: 16
Text: 18

Alerts

Dataset has 157 (0.8%) duplicate rows Duplicates
What is your gender? is highly imbalanced (61.0%) Imbalance
What programming language would you recommend an aspiring data scientist to learn first? is highly imbalanced (65.0%) Imbalance
Have you ever used a TPU (tensor processing unit)? is highly imbalanced (57.0%) Imbalance
What is the highest level of formal education that you have attained or plan to attain within the next 2 years? has 394 (2.0%) missing values Missing
Select the title most similar to your current role (or most recent title if retired) has 610 (3.1%) missing values Missing
What is the size of the company where you are employed? has 5715 (29.0%) missing values Missing
Approximately how many individuals are responsible for data science workloads at your place of business? has 6094 (30.9%) missing values Missing
Does your current employer incorporate machine learning methods into their business? has 6490 (32.9%) missing values Missing
What is your current yearly compensation (approximate $USD)? has 7220 (36.6%) missing values Missing
Approximately how much money have you spent on machine learning and/or cloud computing products at your work in the past 5 years? has 7467 (37.9%) missing values Missing
How long have you been writing code to analyze data (at work or at school)? has 4090 (20.7%) missing values Missing
What programming language would you recommend an aspiring data scientist to learn first? has 5340 (27.1%) missing values Missing
Have you ever used a TPU (tensor processing unit)? has 5514 (28.0%) missing values Missing
For how many years have you used machine learning methods? has 5535 (28.1%) missing values Missing
Select any activities that make up an important part of your role at work: has 10491 (53.2%) missing values Missing
Who/what are your favorite media sources that report on data science topics? has 2936 (14.9%) missing values Missing
On which platforms have you begun or completed data science courses? has 3148 (16.0%) missing values Missing
Which of the following integrated development environments (IDE's) do you use on a regular basis? has 5090 (25.8%) missing values Missing
Which of the following hosted notebook products do you use on a regular basis? has 5274 (26.7%) missing values Missing
What programming languages do you use on a regular basis? has 5313 (26.9%) missing values Missing
What data visualization libraries or tools do you use on a regular basis? has 5464 (27.7%) missing values Missing
Which types of specialized hardware do you use on a regular basis? has 5499 (27.9%) missing values Missing
Which of the following ML algorithms do you use on a regular basis? has 5629 (28.5%) missing values Missing
Which categories of ML tools do you use on a regular basis? has 5802 (29.4%) missing values Missing
Which categories of computer vision methods do you use on a regular basis? has 14225 (72.1%) missing values Missing
Which of the following natural language processing (NLP) methods do you use on a regular basis? has 16135 (81.8%) missing values Missing
Which of the following machine learning frameworks do you use on a regular basis? has 5964 (30.2%) missing values Missing
Which of the following cloud computing platforms do you use on a regular basis? has 12592 (63.9%) missing values Missing
Which specific cloud computing products do you use on a regular basis? has 12617 (64.0%) missing values Missing
Which specific big data / analytics products do you use on a regular basis? has 12639 (64.1%) missing values Missing
Which of the following machine learning products do you use on a regular basis? has 12667 (64.2%) missing values Missing
Which automated machine learning tools (or partial AutoML tools) do you use on a regular basis? has 12702 (64.4%) missing values Missing
Which of the following relational database products do you use on a regular basis? has 12723 (64.5%) missing values Missing
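
The imbalance percentages in the alerts above are consistent with a score of one minus the normalised Shannon entropy of the category counts. The sketch below is an assumption about how the figure is derived, not a copy of the tool's code. It uses the four gender counts listed later in this report, and the function name `imbalance_score` is hypothetical:

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log(k) for k >= 2 classes (0 means a perfectly even split)."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    # Shannon entropy of the observed class proportions (natural log).
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    # Dividing by log(k) scales the entropy to the range [0, 1].
    return 1 - entropy / math.log(k)


# Counts for "What is your gender?": Male, Female, Prefer not to say,
# Prefer to self-describe.
gender_counts = [16138, 3212, 318, 49]
score = imbalance_score(gender_counts)
print(f"{score:.1%}")
```

For the gender counts this comes out at roughly 61%, in line with the alert for that variable.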

Reproduction

Analysis started: 2024-11-05 13:36:20.298446
Analysis finished: 2024-11-05 13:36:35.480373
Duration: 15.18 seconds
Software version: ydata-profiling v4.12.0
Download configuration: config.json
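
The reported duration can be checked against the two timestamps above. A small sketch using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the reproduction details above.
started = datetime.fromisoformat("2024-11-05 13:36:20.298446")
finished = datetime.fromisoformat("2024-11-05 13:36:35.480373")

# Subtracting two datetimes gives a timedelta; convert it to seconds.
duration = (finished - started).total_seconds()
print(f"Duration: {duration:.2f} seconds")
```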

Variables

Distinct: 11
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 154.2 KiB

25-29: 4458
22-24: 3610
30-34: 3120
18-21: 2502
35-39: 2087
Other values (6): 3940

Length

Max length: 5
Median length: 5
Mean length: 4.9898565
Min length: 3

Characters and Unicode

Total characters: 98385
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 22-24
2nd row: 40-44
3rd row: 55-59
4th row: 40-44
5th row: 22-24

Common Values

Value | Count | Frequency (%)
25-29 | 4458 | 22.6%
22-24 | 3610 | 18.3%
30-34 | 3120 | 15.8%
18-21 | 2502 | 12.7%
35-39 | 2087 | 10.6%
40-44 | 1439 | 7.3%
45-49 | 949 | 4.8%
50-54 | 692 | 3.5%
55-59 | 422 | 2.1%
60-69 | 338 | 1.7%

Length

Histogram of lengths of the category

Most occurring characters

Value | Count | Frequency (%)
2 | 22248 | 22.6%
- | 19617 | 19.9%
4 | 13637 | 13.9%
3 | 10414 | 10.6%
5 | 10144 | 10.3%
9 | 8254 | 8.4%
0 | 5689 | 5.8%
1 | 5004 | 5.1%
8 | 2502 | 2.5%
6 | 676 | 0.7%
Other values (2) | 200 | 0.2%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 98385 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 98385 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 98385 | 100.0%

What is your gender?
Categorical

Imbalance 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 154.2 KiB

Male: 16138
Female: 3212
Prefer not to say: 318
Prefer to self-describe: 49

Length

Max length: 23
Median length: 4
Mean length: 4.5826951
Min length: 4

Characters and Unicode

Total characters: 90357
Distinct characters: 20
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Male
2nd row: Male
3rd row: Female
4th row: Male
5th row: Male

Common Values

Value | Count | Frequency (%)
Male | 16138 | 81.8%
Female | 3212 | 16.3%
Prefer not to say | 318 | 1.6%
Prefer to self-describe | 49 | 0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
male | 16138 | 77.7%
female | 3212 | 15.5%
prefer | 367 | 1.8%
to | 367 | 1.8%
not | 318 | 1.5%
say | 318 | 1.5%
self-describe | 49 | 0.2%

Most occurring characters

Value | Count | Frequency (%)
e | 23443 | 25.9%
a | 19668 | 21.8%
l | 19399 | 21.5%
M | 16138 | 17.9%
F | 3212 | 3.6%
m | 3212 | 3.6%
(space) | 1052 | 1.2%
r | 783 | 0.9%
o | 685 | 0.8%
t | 685 | 0.8%
Other values (10) | 2080 | 2.3%
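
The word and character tables above can be rebuilt from the four category counts alone. The sketch below assumes words are obtained by lower-casing each value and splitting on whitespace, which is a guess that happens to reproduce the figures and is not a confirmed description of the profiler's tokenizer:

```python
from collections import Counter

# Category counts copied from the Common Values table for this variable.
value_counts = {
    "Male": 16138,
    "Female": 3212,
    "Prefer not to say": 318,
    "Prefer to self-describe": 49,
}

words = Counter()
chars = Counter()
for value, count in value_counts.items():
    # Assumed tokenisation: lower-case, then split on whitespace.
    for word in value.lower().split():
        words[word] += count
    # Characters are counted case-sensitively, spaces included.
    for ch in value:
        chars[ch] += count

total_words = sum(words.values())
total_chars = sum(chars.values())

print(words.most_common(3))
print(chars.most_common(3))
print(f"male: {100 * words['male'] / total_words:.1f}% of all words")
print(f"total characters: {total_chars}")
```

Note that the word percentages use the total number of words as the denominator, not the number of rows, which is why "male" has a smaller share here than "Male" has in the Common Values table.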

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 90357 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 90357 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 90357 | 100.0%

Distinct: 59
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 154.2 KiB

Length

Max length: 52
Median length: 28
Mean length: 10.232642
Min length: 4

Characters and Unicode

Total characters: 201757
Distinct characters: 49
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: France
2nd row: India
3rd row: Germany
4th row: Australia
5th row: India

Words

Value | Count | Frequency (%)
india | 4786 | 14.3%
of | 3736 | 11.2%
united | 3567 | 10.6%
states | 3085 | 9.2%
america | 3085 | 9.2%
other | 1054 | 3.1%
brazil | 728 | 2.2%
japan | 673 | 2.0%
russia | 626 | 1.9%
china | 574 | 1.7%
Other values (63) | 11583 | 34.6%

Most occurring characters

Value | Count | Frequency (%)
a | 24865 | 12.3%
i | 19357 | 9.6%
e | 17291 | 8.6%
n | 16760 | 8.3%
t | 14087 | 7.0%
(space) | 13780 | 6.8%
d | 11371 | 5.6%
r | 11349 | 5.6%
o | 7043 | 3.5%
I | 6091 | 3.0%
Other values (39) | 59763 | 29.6%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 201757 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 201757 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 201757 | 100.0%

What is the highest level of formal education that you have attained or plan to attain within the next 2 years?

Distinct: 7
Distinct (%): < 0.1%
Missing: 394
Missing (%): 2.0%
Memory size: 154.2 KiB

Master’s degree: 8549
Bachelor’s degree: 5993
Doctoral degree: 2767
Some college/university study without earning a bachelor’s degree: 837
Professional degree: 611
Other values (2): 566

Length

Max length: 65
Median length: 15
Mean length: 18.286446
Min length: 15

Characters and Unicode

Total characters: 353349
Distinct characters: 31
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Master’s degree
2nd row: Professional degree
3rd row: Professional degree
4th row: Master’s degree
5th row: Bachelor’s degree
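
The mean length reported above appears to be the total character count divided by the number of non-missing answers. This is an inference from the figures in this section, sketched below with illustrative variable names:

```python
# Figures copied from the overview and from this variable's summary.
n_observations = 19717
missing = 394
total_characters = 353349

# Assumption: missing answers contribute neither characters nor rows.
non_missing = n_observations - missing
mean_length = total_characters / non_missing

print(f"Non-missing answers: {non_missing}")
print(f"Mean length: {mean_length:.6f}")
```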

Common Values

Value | Count | Frequency (%)
Master’s degree | 8549 | 43.4%
Bachelor’s degree | 5993 | 30.4%
Doctoral degree | 2767 | 14.0%
Some college/university study without earning a bachelor’s degree | 837 | 4.2%
Professional degree | 611 | 3.1%
I prefer not to answer | 333 | 1.7%
No formal education past high school | 233 | 1.2%
(Missing) | 394 | 2.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
degree | 18757 | 41.1%
master’s | 8549 | 18.7%
bachelor’s | 6830 | 15.0%
doctoral | 2767 | 6.1%
some | 837 | 1.8%
college/university | 837 | 1.8%
study | 837 | 1.8%
without | 837 | 1.8%
earning | 837 | 1.8%
a | 837 | 1.8%
Other values (12) | 3674 | 8.1%

Most occurring characters

Value | Count | Frequency (%)
e | 77678 | 22.0%
r | 40420 | 11.4%
s | 27623 | 7.8%
(space) | 26276 | 7.4%
a | 21463 | 6.1%
g | 20664 | 5.8%
d | 19827 | 5.6%
o | 17928 | 5.1%
t | 15796 | 4.5%
’ | 15379 | 4.4%
Other values (21) | 70295 | 19.9%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 353349 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 353349 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 353349 | 100.0%

Select the title most similar to your current role (or most recent title if retired)

Distinct: 12
Distinct (%): 0.1%
Missing: 610
Missing (%): 3.1%
Memory size: 154.2 KiB

Data Scientist: 4085
Student: 4014
Software Engineer: 2705
Other: 1690
Data Analyst: 1598
Other values (7): 5015

Length

Max length: 23
Median length: 18
Mean length: 12.61276
Min length: 5

Characters and Unicode

Total characters: 240992
Distinct characters: 33
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Software Engineer
2nd row: Software Engineer
3rd row: Other
4th row: Other
5th row: Data Scientist

Common Values

Value | Count | Frequency (%)
Data Scientist | 4085 | 20.7%
Student | 4014 | 20.4%
Software Engineer | 2705 | 13.7%
Other | 1690 | 8.6%
Data Analyst | 1598 | 8.1%
Research Scientist | 1470 | 7.5%
Not employed | 942 | 4.8%
Business Analyst | 778 | 3.9%
Product/Project Manager | 723 | 3.7%
Data Engineer | 624 | 3.2%
Other values (2) | 478 | 2.4%
(Missing) | 610 | 3.1%

Length

Histogram of lengths of the category

Words

Value | Count | Frequency (%)
data | 6307 | 19.6%
scientist | 5555 | 17.3%
student | 4014 | 12.5%
engineer | 3485 | 10.8%
software | 2705 | 8.4%
analyst | 2376 | 7.4%
other | 1690 | 5.3%
research | 1470 | 4.6%
not | 942 | 2.9%
employed | 942 | 2.9%
Other values (5) | 2702 | 8.4%

Most occurring characters

Value | Count | Frequency (%)
t | 35726 | 14.8%
e | 28138 | 11.7%
a | 21723 | 9.0%
n | 20738 | 8.6%
i | 16339 | 6.8%
(space) | 13081 | 5.4%
S | 12596 | 5.2%
s | 12213 | 5.1%
r | 11519 | 4.8%
c | 8793 | 3.6%
Other values (23) | 60126 | 24.9%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 240992 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 240992 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 240992 | 100.0%

What is the size of the company where you are employed?

Distinct: 5
Distinct (%): < 0.1%
Missing: 5715
Missing (%): 29.0%
Memory size: 154.2 KiB

0-49 employees: 4025
> 10,000 employees: 3160
1000-9,999 employees: 2641
50-249 employees: 2329
250-999 employees: 1847

Length

Max length: 20
Median length: 18
Mean length: 16.76282
Min length: 14

Characters and Unicode

Total characters: 234713
Distinct characters: 17
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1000-9,999 employees
2nd row: > 10,000 employees
3rd row: > 10,000 employees
4th row: 0-49 employees
5th row: 0-49 employees

Common Values

Value | Count | Frequency (%)
0-49 employees | 4025 | 20.4%
> 10,000 employees | 3160 | 16.0%
1000-9,999 employees | 2641 | 13.4%
50-249 employees | 2329 | 11.8%
250-999 employees | 1847 | 9.4%
(Missing) | 5715 | 29.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
employees | 14002 | 44.9%
0-49 | 4025 | 12.9%
> | 3160 | 10.1%
10,000 | 3160 | 10.1%
1000-9,999 | 2641 | 8.5%
50-249 | 2329 | 7.5%
250-999 | 1847 | 5.9%

Most occurring characters

Value | Count | Frequency (%)
e | 42006 | 17.9%
0 | 28764 | 12.3%
9 | 22459 | 9.6%
(space) | 17162 | 7.3%
o | 14002 | 6.0%
s | 14002 | 6.0%
y | 14002 | 6.0%
l | 14002 | 6.0%
p | 14002 | 6.0%
m | 14002 | 6.0%
Other values (7) | 40310 | 17.2%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 234713 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 234713 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 234713 | 100.0%

Approximately how many individuals are responsible for data science workloads at your place of business?

Distinct: 7
Distinct (%): 0.1%
Missing: 6094
Missing (%): 30.9%
Memory size: 154.2 KiB

20+: 3178
1-2: 3005
3-4: 2319
0: 1880
5-9: 1847
Other values (2): 1394

Length

Max length: 5
Median length: 3
Mean length: 2.9286501
Min length: 1

Characters and Unicode

Total characters: 39897
Distinct characters: 9
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 20+
3rd row: 20+
4th row: 0
5th row: 3-4

Common Values

Value | Count | Frequency (%)
20+ | 3178 | 16.1%
1-2 | 3005 | 15.2%
3-4 | 2319 | 11.8%
0 | 1880 | 9.5%
5-9 | 1847 | 9.4%
10-14 | 967 | 4.9%
15-19 | 427 | 2.2%
(Missing) | 6094 | 30.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
20 | 3178 | 23.3%
1-2 | 3005 | 22.1%
3-4 | 2319 | 17.0%
0 | 1880 | 13.8%
5-9 | 1847 | 13.6%
10-14 | 967 | 7.1%
15-19 | 427 | 3.1%

Most occurring characters

Value | Count | Frequency (%)
- | 8565 | 21.5%
2 | 6183 | 15.5%
0 | 6025 | 15.1%
1 | 5793 | 14.5%
4 | 3286 | 8.2%
+ | 3178 | 8.0%
3 | 2319 | 5.8%
5 | 2274 | 5.7%
9 | 2274 | 5.7%
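
The same count of 3178 appears above with two different percentages because the tables use different denominators. A short sketch of that arithmetic, assuming the Common Values table divides by all rows (missing included) and the word table divides by the total number of words, which here equals the number of non-missing answers because every answer is a single token:

```python
# Figures copied from the overview and from this variable's tables.
n_observations = 19717
missing = 6094
count_20_plus = 3178

# Common Values: share of all rows, with missing rows in the denominator.
share_of_all_rows = 100 * count_20_plus / n_observations

# Word table: share of all words; one word per non-missing answer here.
total_words = n_observations - missing
share_of_words = 100 * count_20_plus / total_words

print(f"Share of all rows: {share_of_all_rows:.1f}%")
print(f"Share of words:    {share_of_words:.1f}%")
```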

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 39897 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 39897 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 39897 | 100.0%

Does your current employer incorporate machine learning methods into their business?

Distinct: 6
Distinct (%): < 0.1%
Missing: 6490
Missing (%): 32.9%
Memory size: 154.2 KiB

We are exploring ML methods (and may one day put a model into production): 2812
We recently started using ML methods (i.e., models in production for less than 2 years): 2731
We have well established ML methods (i.e., models in production for more than 2 years): 2528
No (we do not use ML methods): 2415
We use ML methods for generating insights (but do not put working models into production): 1550

Length

Max length: 89
Median length: 86
Mean length: 66.814017
Min length: 13

Characters and Unicode

Total characters: 883749
Distinct characters: 34
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: I do not know
2nd row: We have well established ML methods (i.e., models in production for more than 2 years)
3rd row: I do not know
4th row: No (we do not use ML methods)
5th row: We have well established ML methods (i.e., models in production for more than 2 years)

Common Values

Value | Count | Frequency (%)
We are exploring ML methods (and may one day put a model into production) | 2812 | 14.3%
We recently started using ML methods (i.e., models in production for less than 2 years) | 2731 | 13.9%
We have well established ML methods (i.e., models in production for more than 2 years) | 2528 | 12.8%
No (we do not use ML methods) | 2415 | 12.2%
We use ML methods for generating insights (but do not put working models into production) | 1550 | 7.9%
I do not know | 1191 | 6.0%
(Missing) | 6490 | 32.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
we | 12036 | 7.4%
ml | 12036 | 7.4%
methods | 12036 | 7.4%
production | 9621 | 5.9%
for | 6809 | 4.2%
models | 6809 | 4.2%
years | 5259 | 3.2%
2 | 5259 | 3.2%
than | 5259 | 3.2%
i.e | 5259 | 3.2%
Other values (29) | 82789 | 50.7%

Most occurring characters

Value | Count | Frequency (%)
(space) | 149945 | 17.0%
e | 83276 | 9.4%
o | 75690 | 8.6%
t | 56167 | 6.4%
n | 50946 | 5.8%
d | 47317 | 5.4%
s | 47149 | 5.3%
i | 38772 | 4.4%
r | 38403 | 4.3%
a | 33915 | 3.8%
Other values (24) | 262169 | 29.7%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 883749 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 883749 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 883749 | 100.0%

What is your current yearly compensation (approximate $USD)?

Distinct: 25
Distinct (%): 0.2%
Missing: 7220
Missing (%): 36.6%
Memory size: 154.2 KiB

$0-999: 1513
10,000-14,999: 833
100,000-124,999: 750
30,000-39,999: 728
40,000-49,999: 719
Other values (20): 7954

Length

Max length: 15
Median length: 13
Mean length: 12.04361
Min length: 6

Characters and Unicode

Total characters: 150509
Distinct characters: 15
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 30,000-39,999
2nd row: 5,000-7,499
3rd row: 250,000-299,999
4th row: 4,000-4,999
5th row: 60,000-69,999

Common Values

Value | Count | Frequency (%)
$0-999 | 1513 | 7.7%
10,000-14,999 | 833 | 4.2%
100,000-124,999 | 750 | 3.8%
30,000-39,999 | 728 | 3.7%
40,000-49,999 | 719 | 3.6%
50,000-59,999 | 704 | 3.6%
1,000-1,999 | 599 | 3.0%
60,000-69,999 | 576 | 2.9%
5,000-7,499 | 536 | 2.7%
15,000-19,999 | 529 | 2.7%
Other values (15) | 5010 | 25.4%
(Missing) | 7220 | 36.6%

Length

Histogram of lengths of the category

Words

Value | Count | Frequency (%)
0-999 | 1513 | 12.0%
10,000-14,999 | 833 | 6.6%
100,000-124,999 | 750 | 6.0%
30,000-39,999 | 728 | 5.8%
40,000-49,999 | 719 | 5.7%
50,000-59,999 | 704 | 5.6%
1,000-1,999 | 599 | 4.8%
60,000-69,999 | 576 | 4.6%
5,000-7,499 | 536 | 4.3%
15,000-19,999 | 529 | 4.2%
Other values (16) | 5093 | 40.5%

Most occurring characters

Value | Count | Frequency (%)
9 | 44336 | 29.5%
0 | 42462 | 28.2%
, | 21885 | 14.5%
- | 12414 | 8.2%
1 | 7256 | 4.8%
4 | 5309 | 3.5%
5 | 4502 | 3.0%
2 | 4489 | 3.0%
3 | 2140 | 1.4%
7 | 1992 | 1.3%
Other values (5) | 3724 | 2.5%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 150509 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 150509 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 150509 | 100.0%

Approximately how much money have you spent on machine learning and/or cloud computing products at your work in the past 5 years?

Distinct: 6
Distinct (%): < 0.1%
Missing: 7467
Missing (%): 37.9%
Memory size: 154.2 KiB

$0 (USD): 4038
$100-$999: 2335
$1000-$9,999: 2123
$1-$99: 1485
$10,000-$99,999: 1268

Length

Max length: 17
Median length: 15
Mean length: 10.101388
Min length: 6

Characters and Unicode

Total characters: 123742
Distinct characters: 13
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: $0 (USD)
2nd row: > $100,000 ($USD)
3rd row: $10,000-$99,999
4th row: $0 (USD)
5th row: $10,000-$99,999

Common Values

Value | Count | Frequency (%)
$0 (USD) | 4038 | 20.5%
$100-$999 | 2335 | 11.8%
$1000-$9,999 | 2123 | 10.8%
$1-$99 | 1485 | 7.5%
$10,000-$99,999 | 1268 | 6.4%
> $100,000 ($USD) | 1001 | 5.1%
(Missing) | 7467 | 37.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
usd | 5039 | 27.6%
0 | 4038 | 22.1%
100-$999 | 2335 | 12.8%
1000-$9,999 | 2123 | 11.6%
1-$99 | 1485 | 8.1%
10,000-$99,999 | 1268 | 6.9%
> | 1001 | 5.5%
100,000 | 1001 | 5.5%

Most occurring characters

Value | Count | Frequency (%)
0 | 25154 | 20.3%
9 | 24807 | 20.0%
$ | 20462 | 16.5%
1 | 8212 | 6.6%
- | 7211 | 5.8%
(space) | 6040 | 4.9%
, | 5660 | 4.6%
( | 5039 | 4.1%
U | 5039 | 4.1%
S | 5039 | 4.1%
Other values (3) | 11079 | 9.0%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 123742 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 123742 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 123742 | 100.0%

Distinct: 4975
Distinct (%): 25.2%
Missing: 0
Missing (%): 0.0%
Memory size: 154.2 KiB

Length

Max length: 89
Median length: 86
Mean length: 63.65938
Min length: 18

Characters and Unicode

Total characters: 1255172
Distinct characters: 55
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4355
Unique (%): 22.1%

Sample

1st row: Basic statistical software (Microsoft Excel, Google Sheets, etc.), 0, -1, -1, -1, -1
2nd row: Cloud-based data software & APIs (AWS, GCP, Azure, etc.), -1, -1, -1, -1, 0
3rd row: -1, -1, -1, -1, -1
4th row: Local development environments (RStudio, JupyterLab, etc.), -1, -1, -1, 0, -1
5th row: Local development environments (RStudio, JupyterLab, etc.), -1, -1, -1, 1, -1

Words

Value | Count | Frequency (%)
1 | 85278 | 43.2%
etc | 14500 | 7.3%
local | 8475 | 4.3%
development | 8475 | 4.3%
environments | 8475 | 4.3%
rstudio | 8475 | 4.3%
jupyterlab | 8475 | 4.3%
software | 6025 | 3.1%
statistical | 3956 | 2.0%
excel | 3061 | 1.6%
Other values (2861) | 42147 | 21.4%

Most occurring characters

Value | Count | Frequency (%)
(space) | 177625 | 14.2%
, | 125627 | 10.0%
e | 95128 | 7.6%
1 | 90608 | 7.2%
- | 85279 | 6.8%
t | 76555 | 6.1%
o | 55119 | 4.4%
a | 41050 | 3.3%
c | 38771 | 3.1%
s | 37495 | 3.0%
Other values (45) | 431915 | 34.4%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 1255172 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 1255172 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 1255172 | 100.0%
Distinct7
Distinct (%)< 0.1%
Missing4090
Missing (%)20.7%
Memory size154.2 KiB
1-2 years
4061 
< 1 years
3828 
3-5 years
3365 
5-10 years
1887 
10-20 years
1045 
Other values (2)
1441 

Length

Max length25
Median length9
Mean length10.140142
Min length9

Characters and Unicode

Total characters158460
Distinct characters24
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
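
As a rough illustration of the character properties mentioned above, the sketch below counts characters by Unicode general category using Python's standard `unicodedata` module. The helper name `category_counts` and the three sample strings are assumptions chosen for illustration (the strings are values that appear in this section); this is not the code that produced the report, which lists the category here as (unknown).

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count characters across `values` by Unicode general category (e.g. 'Ll', 'Nd')."""
    counts = Counter()
    for value in values:
        for ch in value:
            # unicodedata.category returns the two-letter general category code.
            counts[unicodedata.category(ch)] += 1
    return counts


# Hypothetical sample: three of the answer strings shown in this section.
sample = ["1-2 years", "< 1 years", "20+ years"]
counts = category_counts(sample)
print(counts.most_common())
```

Here `Ll` is a lowercase letter, `Nd` a decimal digit, `Zs` a space separator, `Sm` a math symbol (such as `<` and `+`), and `Pd` a dash.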

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1-2 years
2nd row: I have never written code
3rd row: 1-2 years
4th row: < 1 years
5th row: 20+ years

Common Values

Value                      Count  Frequency (%)
1-2 years                  4061   20.6%
< 1 years                  3828   19.4%
3-5 years                  3365   17.1%
5-10 years                 1887   9.6%
10-20 years                1045   5.3%
I have never written code  865    4.4%
20+ years                  576    2.9%
(Missing)                  4090   20.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value             Count  Frequency (%)
years             14762  39.2%
1-2               4061   10.8%
(blank)           3828   10.2%
1                 3828   10.2%
3-5               3365   8.9%
5-10              1887   5.0%
10-20             1045   2.8%
i                 865    2.3%
have              865    2.3%
never             865    2.3%
Other values (3)  2306   6.1%

Most occurring characters

Value              Count  Frequency (%)
(space)            22050  13.9%
e                  19087  12.0%
r                  16492  10.4%
a                  15627  9.9%
y                  14762  9.3%
s                  14762  9.3%
1                  10821  6.8%
-                  10358  6.5%
2                  5682   3.6%
5                  5252   3.3%
Other values (14)  23567  14.9%

Most occurring categories

Value      Count   Frequency (%)
(unknown)  158460  100.0%

Most occurring scripts

Value      Count   Frequency (%)
(unknown)  158460  100.0%

Most occurring blocks

Value      Count   Frequency (%)
(unknown)  158460  100.0%
Distinct: 12
Distinct (%): 0.1%
Missing: 5340
Missing (%): 27.1%
Memory size: 154.2 KiB

Python            11316
R                 1343
SQL               817
C++               199
MATLAB            162
Other values (7)  540

Length

Max length: 10
Median length: 6
Mean length: 5.2444182
Min length: 1

Characters and Unicode

Total characters: 75399
Distinct characters: 27
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Python
2nd row: Python
3rd row: Python
4th row: Java
5th row: Python

Common Values

Value             Count  Frequency (%)
Python            11316  57.4%
R                 1343   6.8%
SQL               817    4.1%
C++               199    1.0%
MATLAB            162    0.8%
C                 153    0.8%
Other             127    0.6%
Java              104    0.5%
None              69     0.3%
Javascript        47     0.2%
Other values (2)  40     0.2%
(Missing)         5340   27.1%

Length

Histogram of lengths of the category
Value       Count  Frequency (%)
python      11316  78.7%
r           1343   9.3%
sql         817    5.7%
c           352    2.4%
matlab      162    1.1%
other       127    0.9%
java        104    0.7%
none        69     0.5%
javascript  47     0.3%
bash        35     0.2%

Most occurring characters

Value              Count  Frequency (%)
t                  11495  15.2%
h                  11478  15.2%
o                  11385  15.1%
n                  11385  15.1%
y                  11321  15.0%
P                  11316  15.0%
R                  1343   1.8%
L                  979    1.3%
S                  822    1.1%
Q                  817    1.1%
Other values (17)  3058   4.1%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  75399  100.0%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  75399  100.0%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  75399  100.0%
Distinct: 5
Distinct (%): < 0.1%
Missing: 5514
Missing (%): 28.0%
Memory size: 154.2 KiB

Never       11495
Once        1320
2-5 times   1037
6-24 times  193
> 25 times  158

Length

Max length: 10
Median length: 5
Mean length: 5.3226783
Min length: 4

Characters and Unicode

Total characters: 75598
Distinct characters: 18
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Never
2nd row: Once
3rd row: Never
4th row: Never
5th row: 6-24 times

Common Values

Value       Count  Frequency (%)
Never       11495  58.3%
Once        1320   6.7%
2-5 times   1037   5.3%
6-24 times  193    1.0%
> 25 times  158    0.8%
(Missing)   5514   28.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value    Count  Frequency (%)
never    11495  73.0%
times    1388   8.8%
once     1320   8.4%
2-5      1037   6.6%
6-24     193    1.2%
(blank)  158    1.0%
25       158    1.0%

Most occurring characters

Value             Count  Frequency (%)
e                 25698  34.0%
N                 11495  15.2%
v                 11495  15.2%
r                 11495  15.2%
(space)           1546   2.0%
s                 1388   1.8%
2                 1388   1.8%
t                 1388   1.8%
i                 1388   1.8%
m                 1388   1.8%
Other values (8)  6929   9.2%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  75598  100.0%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  75598  100.0%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  75598  100.0%
Distinct: 8
Distinct (%): 0.1%
Missing: 5535
Missing (%): 28.1%
Memory size: 154.2 KiB

< 1 years         5149
1-2 years         3798
2-3 years         1840
3-4 years         1080
4-5 years         927
Other values (3)  1388

Length

Max length: 11
Median length: 9
Mean length: 9.1086589
Min length: 9

Characters and Unicode

Total characters: 129179
Distinct characters: 15
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1-2 years
2nd row: 2-3 years
3rd row: < 1 years
4th row: 10-15 years
5th row: 2-3 years

Common Values

Value        Count  Frequency (%)
< 1 years    5149   26.1%
1-2 years    3798   19.3%
2-3 years    1840   9.3%
3-4 years    1080   5.5%
4-5 years    927    4.7%
5-10 years   869    4.4%
10-15 years  336    1.7%
20+ years    183    0.9%
(Missing)    5535   28.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value    Count  Frequency (%)
years    14182  42.3%
(blank)  5149   15.4%
1        5149   15.4%
1-2      3798   11.3%
2-3      1840   5.5%
3-4      1080   3.2%
4-5      927    2.8%
5-10     869    2.6%
10-15    336    1.0%
20       183    0.5%

Most occurring characters

Value             Count  Frequency (%)
(space)           19331  15.0%
y                 14182  11.0%
e                 14182  11.0%
a                 14182  11.0%
r                 14182  11.0%
s                 14182  11.0%
1                 10488  8.1%
-                 8850   6.9%
2                 5821   4.5%
<                 5149   4.0%
Other values (5)  8630   6.7%

Most occurring categories

Value      Count   Frequency (%)
(unknown)  129179  100.0%

Most occurring scripts

Value      Count   Frequency (%)
(unknown)  129179  100.0%

Most occurring blocks

Value      Count   Frequency (%)
(unknown)  129179  100.0%
Distinct: 98
Distinct (%): 1.1%
Missing: 10491
Missing (%): 53.2%
Memory size: 154.2 KiB

Length

Max length: 485
Median length: 364
Mean length: 207.43822
Min length: 5

Characters and Unicode

Total characters: 1913825
Distinct characters: 35
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 13
Unique (%): 0.1%

Sample

1st row: Analyze and understand data to influence product or business decisions, Build and/or run the data infrastructure that my business uses for storing, analyzing, and operationalizing data, Build prototypes to explore applying machine learning to new areas, Build and/or run a machine learning service that operationally improves my product or workflows
2nd row: Build prototypes to explore applying machine learning to new areas, Do research that advances the state of the art of machine learning
3rd row: Analyze and understand data to influence product or business decisions, Experimentation and iteration to improve existing ML models, Do research that advances the state of the art of machine learning
4th row: Analyze and understand data to influence product or business decisions, Build prototypes to explore applying machine learning to new areas, Build and/or run a machine learning service that operationally improves my product or workflows
5th row: Other
Value              Count   Frequency (%)
to                 19758   7.1%
and                13362   4.8%
data               13223   4.7%
build              11895   4.3%
machine            10688   3.8%
learning           10688   3.8%
business           9657    3.5%
product            9439    3.4%
or                 9439    3.4%
that               9273    3.3%
Other values (47)  162326  58.0%

Most occurring characters

Value              Count   Frequency (%)
(space)            270522  14.1%
n                  158935  8.3%
a                  154761  8.1%
e                  153868  8.0%
t                  132483  6.9%
i                  125681  6.6%
o                  122671  6.4%
r                  120312  6.3%
s                  97063   5.1%
d                  79170   4.1%
Other values (25)  498359  26.0%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  1913825  100.0%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  1913825  100.0%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  1913825  100.0%
Distinct: 1020
Distinct (%): 6.1%
Missing: 2936
Missing (%): 14.9%
Memory size: 154.2 KiB

Length

Max length: 512
Median length: 412
Mean length: 153.9391
Min length: 4

Characters and Unicode

Total characters: 2583252
Distinct characters: 49
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 313
Unique (%): 1.9%

Sample

1st row: Twitter (data science influencers), Kaggle (forums, blog, social media, etc), Blogs (Towards Data Science, Medium, Analytics Vidhya, KDnuggets etc), Journal Publications (traditional publications, preprint journals, etc)
2nd row: Kaggle (forums, blog, social media, etc), YouTube (Cloud AI Adventures, Siraj Raval, etc), Podcasts (Chai Time Data Science, Linear Digressions, etc), Blogs (Towards Data Science, Medium, Analytics Vidhya, KDnuggets etc)
3rd row: Podcasts (Chai Time Data Science, Linear Digressions, etc), Blogs (Towards Data Science, Medium, Analytics Vidhya, KDnuggets etc), Journal Publications (traditional publications, preprint journals, etc), Slack Communities (ods.ai, kagglenoobs, etc)
4th row: YouTube (Cloud AI Adventures, Siraj Raval, etc), Blogs (Towards Data Science, Medium, Analytics Vidhya, KDnuggets etc), Other
5th row: YouTube (Cloud AI Adventures, Siraj Raval, etc), Blogs (Towards Data Science, Medium, Analytics Vidhya, KDnuggets etc), Journal Publications (traditional publications, preprint journals, etc)
Value              Count   Frequency (%)
etc                44329   14.0%
data               15722   5.0%
science            15722   5.0%
forums             14506   4.6%
kaggle             10751   3.4%
blog               10751   3.4%
social             10751   3.4%
media              10751   3.4%
kdnuggets          9907    3.1%
vidhya             9907    3.1%
Other values (36)  164129  51.7%

Most occurring characters

Value              Count   Frequency (%)
(space)            300445  11.6%
e                  194465  7.5%
a                  181119  7.0%
i                  150113  5.8%
,                  140412  5.4%
t                  138919  5.4%
s                  131026  5.1%
c                  129271  5.0%
o                  120553  4.7%
n                  104219  4.0%
Other values (39)  992710  38.4%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  2583252  100.0%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  2583252  100.0%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  2583252  100.0%
Distinct: 819
Distinct (%): 4.9%
Missing: 3148
Missing (%): 16.0%
Memory size: 154.2 KiB

Length

Max length: 176
Median length: 151
Mean length: 40.252942
Min length: 3

Characters and Unicode

Total characters: 666951
Distinct characters: 35
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 262
Unique (%): 1.6%

Sample

1st row: Coursera, DataCamp, Kaggle Courses (i.e. Kaggle Learn), Udemy
2nd row: Coursera, DataCamp, Kaggle Courses (i.e. Kaggle Learn), Udemy
3rd row: Coursera, edX, DataCamp, University Courses (resulting in a university degree)
4th row: Other
5th row: None
Value              Count  Frequency (%)
kaggle             10238  11.6%
courses            9597   10.8%
university         8956   10.1%
coursera           8685   9.8%
i.e                5119   5.8%
learn              5119   5.8%
udemy              4804   5.4%
resulting          4478   5.1%
in                 4478   5.1%
a                  4478   5.1%
Other values (10)  22616  25.5%

Most occurring characters

Value              Count   Frequency (%)
e                  80252   12.0%
(space)            71999   10.8%
r                  53153   8.0%
a                  48811   7.3%
s                  43576   6.5%
i                  39026   5.9%
g                  30715   4.6%
n                  29654   4.4%
u                  27981   4.2%
t                  25108   3.8%
Other values (25)  216676  32.5%

Most occurring categories

Value      Count   Frequency (%)
(unknown)  666951  100.0%

Most occurring scripts

Value      Count   Frequency (%)
(unknown)  666951  100.0%

Most occurring blocks

Value      Count   Frequency (%)
(unknown)  666951  100.0%
Distinct: 853
Distinct (%): 5.8%
Missing: 5090
Missing (%): 25.8%
Memory size: 154.2 KiB

Length

Max length: 185
Median length: 160
Mean length: 64.706023
Min length: 4

Characters and Unicode

Total characters: 946455
Distinct characters: 39
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 274
Unique (%): 1.9%

Sample

1st row: Jupyter (JupyterLab, Jupyter Notebooks, etc) , RStudio , PyCharm , MATLAB , Spyder
2nd row: Jupyter (JupyterLab, Jupyter Notebooks, etc) , Visual Studio / Visual Studio Code
3rd row: Jupyter (JupyterLab, Jupyter Notebooks, etc)
4th row: RStudio , Other
5th row: Jupyter (JupyterLab, Jupyter Notebooks, etc) , Spyder , Notepad++ , Sublime Text
Value              Count  Frequency (%)
(blank)            30693  22.7%
jupyter            21608  15.9%
notebooks          10804  8.0%
etc                10804  8.0%
jupyterlab         10804  8.0%
visual             9068   6.7%
studio             9068   6.7%
code               4534   3.3%
rstudio            4455   3.3%
pycharm            4224   3.1%
Other values (10)  19424  14.3%

Most occurring characters

Value              Count   Frequency (%)
(space)            183434  19.4%
t                  75657   8.0%
e                  71159   7.5%
u                  57658   6.1%
o                  55475   5.9%
,                  45985   4.9%
r                  40412   4.3%
y                  39721   4.2%
p                  38778   4.1%
J                  32412   3.4%
Other values (29)  305764  32.3%

Most occurring categories

Value      Count   Frequency (%)
(unknown)  946455  100.0%

Most occurring scripts

Value      Count   Frequency (%)
(unknown)  946455  100.0%

Most occurring blocks

Value      Count   Frequency (%)
(unknown)  946455  100.0%
Distinct: 248
Distinct (%): 1.7%
Missing: 5274
Missing (%): 26.7%
Memory size: 154.2 KiB

Length

Max length: 295
Median length: 254
Mean length: 29.514851
Min length: 4

Characters and Unicode

Total characters: 426283
Distinct characters: 44
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 100
Unique (%): 0.7%

Sample

1st row: None
2nd row: Microsoft Azure Notebooks
3rd row: Google Colab , Google Cloud Notebook Products (AI Platform, Datalab, etc)
4th row: None
5th row: Kaggle Notebooks (Kernels) , Google Colab , Binder / JupyterHub
Value              Count  Frequency (%)
(blank)            7815   12.9%
notebooks          7214   11.9%
google             5672   9.4%
none               5177   8.5%
kernels            4845   8.0%
kaggle             4845   8.0%
colab              4551   7.5%
products           1878   3.1%
etc                1878   3.1%
notebook           1878   3.1%
Other values (20)  14831  24.5%

Most occurring characters

Value              Count   Frequency (%)
(space)            68912   16.2%
o                  55693   13.1%
e                  43123   10.1%
l                  23377   5.5%
t                  19566   4.6%
a                  16565   3.9%
b                  16546   3.9%
g                  16119   3.8%
s                  15603   3.7%
r                  14417   3.4%
Other values (34)  136362  32.0%

Most occurring categories

Value      Count   Frequency (%)
(unknown)  426283  100.0%

Most occurring scripts

Value      Count   Frequency (%)
(unknown)  426283  100.0%

Most occurring blocks

Value      Count   Frequency (%)
(unknown)  426283  100.0%
Distinct: 611
Distinct (%): 4.2%
Missing: 5313
Missing (%): 26.9%
Memory size: 154.2 KiB

Length

Max length: 70
Median length: 60
Mean length: 14.848792
Min length: 1

Characters and Unicode

Total characters: 213882
Distinct characters: 29
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 215
Unique (%): 1.5%

Sample

1st row: Python, R, SQL, Java, Javascript, MATLAB
2nd row: Python, R, SQL, Bash
3rd row: Python, SQL
4th row: Python, R
5th row: Python, R, Bash
Value       Count  Frequency (%)
python      12841  34.2%
sql         6532   17.4%
r           4588   12.2%
c           3928   10.5%
java        2267   6.0%
javascript  2174   5.8%
bash        2037   5.4%
matlab      1516   4.0%
other       1148   3.1%
typescript  389    1.0%

Most occurring characters

Value              Count  Frequency (%)
,                  23099  10.8%
(space)            23099  10.8%
t                  16552  7.7%
h                  16026  7.5%
y                  13230  6.2%
o                  12924  6.0%
n                  12924  6.0%
P                  12841  6.0%
a                  10919  5.1%
L                  8048   3.8%
Other values (19)  64220  30.0%

Most occurring categories

Value      Count   Frequency (%)
(unknown)  213882  100.0%

Most occurring scripts

Value      Count   Frequency (%)
(unknown)  213882  100.0%

Most occurring blocks

Value      Count   Frequency (%)
(unknown)  213882  100.0%
Distinct: 439
Distinct (%): 3.1%
Missing: 5464
Missing (%): 27.7%
Memory size: 154.2 KiB

Length

Max length: 141
Median length: 130
Mean length: 30.0174
Min length: 4

Characters and Unicode

Total characters: 427838
Distinct characters: 38
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 165
Unique (%): 1.2%

Sample

1st row: Matplotlib
2nd row: Ggplot / ggplot2 , Matplotlib , Seaborn
3rd row: Matplotlib , Plotly / Plotly Express , Seaborn
4th row: Ggplot / ggplot2
5th row: Matplotlib , Plotly / Plotly Express , Bokeh , Seaborn
Value             Count  Frequency (%)
(blank)           24947  37.0%
matplotlib        10516  15.6%
seaborn           6905   10.3%
plotly            6434   9.6%
ggplot            4182   6.2%
ggplot2           4182   6.2%
express           3217   4.8%
shiny             1244   1.8%
none              1240   1.8%
d3.js             1078   1.6%
Other values (6)  3419   5.1%

Most occurring characters

Value              Count   Frequency (%)
(space)            95205   22.3%
l                  44819   10.5%
t                  37656   8.8%
o                  36340   8.5%
p                  22741   5.3%
a                  18138   4.2%
b                  18065   4.2%
,                  16998   4.0%
e                  14614   3.4%
i                  13121   3.1%
Other values (28)  110141  25.7%

Most occurring categories

Value      Count   Frequency (%)
(unknown)  427838  100.0%

Most occurring scripts

Value      Count   Frequency (%)
(unknown)  427838  100.0%

Most occurring blocks

Value      Count   Frequency (%)
(unknown)  427838  100.0%
Distinct: 14
Distinct (%): 0.1%
Missing: 5499
Missing (%): 27.9%
Memory size: 154.2 KiB

CPUs, GPUs            5041
CPUs                  5001
None / I do not know  2449
GPUs                  1129
CPUs, GPUs, TPUs      348
Other values (9)      250

Length

Max length: 23
Median length: 20
Mean length: 9.2723308
Min length: 4

Characters and Unicode

Total characters: 131834
Distinct characters: 21
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: CPUs, GPUs
2nd row: CPUs, GPUs
3rd row: CPUs, GPUs
4th row: CPUs, GPUs
5th row: CPUs, GPUs

Common Values

Value                 Count  Frequency (%)
CPUs, GPUs            5041   25.6%
CPUs                  5001   25.4%
None / I do not know  2449   12.4%
GPUs                  1129   5.7%
CPUs, GPUs, TPUs      348    1.8%
GPUs, TPUs            82     0.4%
Other                 50     0.3%
TPUs                  30     0.2%
CPUs, TPUs            30     0.2%
CPUs, GPUs, Other     27     0.1%
Other values (4)      31     0.2%
(Missing)             5499   27.9%

Length

Histogram of lengths of the category
Value    Count  Frequency (%)
cpus     10472  32.3%
gpus     6638   20.5%
none     2449   7.6%
(blank)  2449   7.6%
i        2449   7.6%
do       2449   7.6%
not      2449   7.6%
know     2449   7.6%
tpus     496    1.5%
other    108    0.3%

Most occurring characters

Value              Count  Frequency (%)
(space)            18190  13.8%
P                  17606  13.4%
U                  17606  13.4%
s                  17606  13.4%
C                  10472  7.9%
o                  9796   7.4%
n                  7347   5.6%
G                  6638   5.0%
,                  5945   4.5%
t                  2557   1.9%
Other values (11)  18071  13.7%

Most occurring categories

Value      Count   Frequency (%)
(unknown)  131834  100.0%

Most occurring scripts

Value      Count   Frequency (%)
(unknown)  131834  100.0%

Most occurring blocks

Value      Count   Frequency (%)
(unknown)  131834  100.0%
Distinct: 684
Distinct (%): 4.9%
Missing: 5629
Missing (%): 28.5%
Memory size: 154.2 KiB

Length

Max length: 336
Median length: 288
Mean length: 101.29813
Min length: 4

Characters and Unicode

Total characters: 1427088
Distinct characters: 43
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 232
Unique (%): 1.6%

Sample

1st row: Linear or Logistic Regression
2nd row: Linear or Logistic Regression, Convolutional Neural Networks
3rd row: Linear or Logistic Regression, Decision Trees or Random Forests, Gradient Boosting Machines (xgboost, lightgbm, etc)
4th row: Linear or Logistic Regression, Decision Trees or Random Forests, Gradient Boosting Machines (xgboost, lightgbm, etc), Bayesian Approaches, Convolutional Neural Networks, Generative Adversarial Networks, Recurrent Neural Networks
5th row: Linear or Logistic Regression, Dense Neural Networks (MLPs, etc), Convolutional Neural Networks, Recurrent Neural Networks
Value              Count  Frequency (%)
or                 18713  10.5%
networks           14046  7.9%
neural             12162  6.8%
linear             10223  5.7%
logistic           10223  5.7%
regression         10223  5.7%
etc                9769   5.5%
decision           8490   4.8%
trees              8490   4.8%
random             8490   4.8%
Other values (20)  67134  37.7%

Most occurring characters

Value              Count   Frequency (%)
(space)            163875  11.5%
e                  139846  9.8%
o                  125597  8.8%
s                  112138  7.9%
r                  106194  7.4%
i                  92023   6.4%
n                  79316   5.6%
t                  76694   5.4%
a                  64148   4.5%
,                  46623   3.3%
Other values (33)  420634  29.5%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  1427088  100.0%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  1427088  100.0%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  1427088  100.0%
Distinct: 92
Distinct (%): 0.7%
Missing: 5802
Missing (%): 29.4%
Memory size: 154.2 KiB

Length

Max length: 374
Median length: 4
Mean length: 44.514625
Min length: 4

Characters and Unicode

Total characters: 619421
Distinct characters: 41
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
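As an illustration of the character properties mentioned above, Python's standard `unicodedata` module exposes the general category of each code point. This is a sketch only: the report lists a single "(unknown)" category, so it evidently did not resolve these properties, and `category_counts` is a hypothetical helper rather than part of the profiling tool.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Tally the Unicode general category (e.g. 'Lu', 'Ll', 'Zs') of each character."""
    return Counter(unicodedata.category(ch) for ch in text)


sample = "Google AutoML, H20 Driverless AI"
counts = category_counts(sample)
for category, n in counts.most_common():
    print(category, n)
```

Here `Lu` and `Ll` are upper- and lowercase letters, `Zs` is the space separator, `Po` is other punctuation, and `Nd` is a decimal digit.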

Unique

Unique: 15
Unique (%): 0.1%

Sample

1st row: None
2nd row: Automation of full ML pipelines (e.g. Google AutoML, H20 Driverless AI)
3rd row: None
4th row: Automated model selection (e.g. auto-sklearn, xcessiv), Automated hyperparameter tuning (e.g. hyperopt, ray.tune), Automation of full ML pipelines (e.g. Google AutoML, H20 Driverless AI)
5th row: Automated data augmentation (e.g. imgaug, albumentations), Automated feature engineering/selection (e.g. tpot, boruta_py), Automated model selection (e.g. auto-sklearn, xcessiv), Automation of full ML pipelines (e.g. Google AutoML, H20 Driverless AI)
Value | Count | Frequency (%)
e.g | 9911 | 13.4%
automated | 8733 | 11.8%
none | 7822 | 10.6%
model | 3650 | 4.9%
selection | 3200 | 4.3%
auto-sklearn | 3200 | 4.3%
xcessiv | 3200 | 4.3%
data | 1800 | 2.4%
augmentation | 1800 | 2.4%
imgaug | 1800 | 2.4%
Other values (24) | 28741 | 38.9%

Most occurring characters

Value | Count | Frequency (%)
e | 74310 | 12.0%
(whitespace) | 59942 | 9.7%
t | 52616 | 8.5%
o | 43566 | 7.0%
a | 39055 | 6.3%
n | 35582 | 5.7%
u | 27883 | 4.5%
i | 23255 | 3.8%
. | 21600 | 3.5%
s | 21439 | 3.5%
Other values (31) | 220173 | 35.5%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 619421 | 100.0%

Every character falls into the single (unknown) category, script, and block, so the most frequent characters per category, per script, and per block match the table of most occurring characters.
Distinct: 49
Distinct (%): 0.9%
Missing: 14225
Missing (%): 72.1%
Memory size: 154.2 KiB

Length

Max length: 324
Median length: 271
Mean length: 136.52203
Min length: 4

Characters and Unicode

Total characters: 749779
Distinct characters: 46
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9
Unique (%): 0.2%

Sample

1st row: General purpose image/video tools (PIL, cv2, skimage, etc), Image classification and other general purpose networks (VGG, Inception, ResNet, ResNeXt, NASNet, EfficientNet, etc)
2nd row: None
3rd row: General purpose image/video tools (PIL, cv2, skimage, etc), Image segmentation methods (U-Net, Mask R-CNN, etc), Object detection methods (YOLOv3, RetinaNet, etc)
4th row: General purpose image/video tools (PIL, cv2, skimage, etc), Image classification and other general purpose networks (VGG, Inception, ResNet, ResNeXt, NASNet, EfficientNet, etc)
5th row: General purpose image/video tools (PIL, cv2, skimage, etc), Image classification and other general purpose networks (VGG, Inception, ResNet, ResNeXt, NASNet, EfficientNet, etc)

Common Values

Value | Count | Frequency (%)
None | 1203 | 6.1%
Image classification and other general purpose networks (VGG, Inception, ResNet, ResNeXt, NASNet, EfficientNet, etc) | 560 | 2.8%
General purpose image/video tools (PIL, cv2, skimage, etc), Image classification and other general purpose networks (VGG, Inception, ResNet, ResNeXt, NASNet, EfficientNet, etc) | 366 | 1.9%
General purpose image/video tools (PIL, cv2, skimage, etc), Image segmentation methods (U-Net, Mask R-CNN, etc), Object detection methods (YOLOv3, RetinaNet, etc), Image classification and other general purpose networks (VGG, Inception, ResNet, ResNeXt, NASNet, EfficientNet, etc), Generative Networks (GAN, VAE, etc) | 341 | 1.7%
General purpose image/video tools (PIL, cv2, skimage, etc), Image segmentation methods (U-Net, Mask R-CNN, etc), Object detection methods (YOLOv3, RetinaNet, etc), Image classification and other general purpose networks (VGG, Inception, ResNet, ResNeXt, NASNet, EfficientNet, etc) | 326 | 1.7%
General purpose image/video tools (PIL, cv2, skimage, etc), Image segmentation methods (U-Net, Mask R-CNN, etc), Image classification and other general purpose networks (VGG, Inception, ResNet, ResNeXt, NASNet, EfficientNet, etc) | 243 | 1.2%
Image segmentation methods (U-Net, Mask R-CNN, etc), Image classification and other general purpose networks (VGG, Inception, ResNet, ResNeXt, NASNet, EfficientNet, etc) | 237 | 1.2%
General purpose image/video tools (PIL, cv2, skimage, etc) | 233 | 1.2%
Image segmentation methods (U-Net, Mask R-CNN, etc), Object detection methods (YOLOv3, RetinaNet, etc), Image classification and other general purpose networks (VGG, Inception, ResNet, ResNeXt, NASNet, EfficientNet, etc) | 229 | 1.2%
General purpose image/video tools (PIL, cv2, skimage, etc), Object detection methods (YOLOv3, RetinaNet, etc), Image classification and other general purpose networks (VGG, Inception, ResNet, ResNeXt, NASNet, EfficientNet, etc) | 224 | 1.1%
Other values (39) | 1530 | 7.8%
(Missing) | 14225 | 72.1%
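Each value in the table above is a multi-select answer stored as one comma-separated string, and the individual options themselves contain commas inside parentheses. A naive `split(",")` would therefore break options apart. The sketch below splits only on commas outside parentheses; `split_options` is a hypothetical helper, and the assumption that parentheses are always balanced comes from the values shown here, not from any documented guarantee.

```python
from collections import Counter


def split_options(answer):
    """Split a multi-select answer on commas that are not inside parentheses."""
    options, current, depth = [], [], 0
    for ch in answer:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(depth - 1, 0)
        if ch == "," and depth == 0:
            piece = "".join(current).strip()
            if piece:
                options.append(piece)
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        options.append(tail)
    return options


answers = [
    "General purpose image/video tools (PIL, cv2, skimage, etc), "
    "Object detection methods (YOLOv3, RetinaNet, etc)",
    "General purpose image/video tools (PIL, cv2, skimage, etc)",
    "None",
]
# Count how often each individual option was selected.
option_counts = Counter(opt for a in answers for opt in split_options(a))
for option, n in option_counts.most_common():
    print(n, option)
```

Counting per option rather than per combination gives a more direct view of how popular each method is than the combination counts in the table.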

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
etc | 10408 | 11.0%
general | 5394 | 5.7%
purpose | 5394 | 5.7%
image | 5248 | 5.5%
networks | 4268 | 4.5%
methods | 3933 | 4.2%
other | 3238 | 3.4%
and | 3187 | 3.4%
classification | 3187 | 3.4%
inception | 3187 | 3.4%
Other values (22) | 47148 | 49.8%

Most occurring characters

Value | Count | Frequency (%)
e | 95383 | 12.7%
(whitespace) | 89100 | 11.9%
t | 62987 | 8.4%
, | 41941 | 5.6%
o | 34913 | 4.7%
s | 34879 | 4.7%
n | 34666 | 4.6%
i | 32629 | 4.4%
a | 31692 | 4.2%
c | 29107 | 3.9%
Other values (36) | 262482 | 35.0%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 749779 | 100.0%

Every character falls into the single (unknown) category, script, and block, so the most frequent characters per category, per script, and per block match the table of most occurring characters.
Distinct: 28
Distinct (%): 0.8%
Missing: 16135
Missing (%): 81.8%
Memory size: 154.2 KiB

Length

Max length: 210
Median length: 170
Mean length: 74.985204
Min length: 4

Characters and Unicode

Total characters: 268597
Distinct characters: 43
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5
Unique (%): 0.1%

Sample

1st row: Word embeddings/vectors (GLoVe, fastText, word2vec), Encoder-decorder models (seq2seq, vanilla transformers)
2nd row: Word embeddings/vectors (GLoVe, fastText, word2vec), Encoder-decorder models (seq2seq, vanilla transformers)
3rd row: Word embeddings/vectors (GLoVe, fastText, word2vec)
4th row: Word embeddings/vectors (GLoVe, fastText, word2vec), Encoder-decorder models (seq2seq, vanilla transformers), Contextualized embeddings (ELMo, CoVe), Transformer language models (GPT-2, BERT, XLnet, etc)
5th row: None

Common Values

Value | Count | Frequency (%)
None | 1027 | 5.2%
Word embeddings/vectors (GLoVe, fastText, word2vec) | 616 | 3.1%
Word embeddings/vectors (GLoVe, fastText, word2vec), Encoder-decorder models (seq2seq, vanilla transformers) | 498 | 2.5%
Word embeddings/vectors (GLoVe, fastText, word2vec), Encoder-decorder models (seq2seq, vanilla transformers), Contextualized embeddings (ELMo, CoVe), Transformer language models (GPT-2, BERT, XLnet, etc) | 268 | 1.4%
Word embeddings/vectors (GLoVe, fastText, word2vec), Encoder-decorder models (seq2seq, vanilla transformers), Transformer language models (GPT-2, BERT, XLnet, etc) | 250 | 1.3%
Word embeddings/vectors (GLoVe, fastText, word2vec), Transformer language models (GPT-2, BERT, XLnet, etc) | 230 | 1.2%
Encoder-decorder models (seq2seq, vanilla transformers) | 188 | 1.0%
Transformer language models (GPT-2, BERT, XLnet, etc) | 115 | 0.6%
Word embeddings/vectors (GLoVe, fastText, word2vec), Contextualized embeddings (ELMo, CoVe), Transformer language models (GPT-2, BERT, XLnet, etc) | 79 | 0.4%
Word embeddings/vectors (GLoVe, fastText, word2vec), Encoder-decorder models (seq2seq, vanilla transformers), Contextualized embeddings (ELMo, CoVe) | 76 | 0.4%
Other values (18) | 235 | 1.2%
(Missing) | 16135 | 81.8%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
models | 2399 | 8.6%
embeddings/vectors | 2115 | 7.6%
glove | 2115 | 7.6%
fasttext | 2115 | 7.6%
word2vec | 2115 | 7.6%
word | 2115 | 7.6%
seq2seq | 1368 | 4.9%
encoder-decorder | 1368 | 4.9%
vanilla | 1368 | 4.9%
transformers | 1368 | 4.9%
Other values (12) | 9510 | 34.0%

Most occurring characters

Value | Count | Frequency (%)
e | 31307 | 11.7%
(whitespace) | 24374 | 9.1%
o | 18707 | 7.0%
r | 17695 | 6.6%
d | 16649 | 6.2%
s | 15809 | 5.9%
, | 11823 | 4.4%
n | 11463 | 4.3%
t | 10948 | 4.1%
a | 9874 | 3.7%
Other values (33) | 99948 | 37.2%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 268597 | 100.0%

Every character falls into the single (unknown) category, script, and block, so the most frequent characters per category, per script, and per block match the table of most occurring characters.
Distinct: 584
Distinct (%): 4.2%
Missing: 5964
Missing (%): 30.2%
Memory size: 154.2 KiB

Length

Max length: 129
Median length: 108
Mean length: 36.025522
Min length: 4

Characters and Unicode

Total characters: 495459
Distinct characters: 37
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 202
Unique (%): 1.5%

Sample

1st row: None
2nd row: Scikit-learn , TensorFlow , Keras , RandomForest
3rd row: Scikit-learn , RandomForest, Xgboost , LightGBM
4th row: Scikit-learn , TensorFlow , Keras , RandomForest, Xgboost , Caret
5th row: Scikit-learn , TensorFlow , Keras , PyTorch
Value | Count | Frequency (%)
(blank) | 23108 | 35.9%
scikit-learn | 9390 | 14.6%
tensorflow | 5822 | 9.0%
keras | 5756 | 8.9%
randomforest | 4524 | 7.0%
xgboost | 4243 | 6.6%
pytorch | 3412 | 5.3%
lightgbm | 2166 | 3.4%
none | 1720 | 2.7%
caret | 1139 | 1.8%
Other values (4) | 3111 | 4.8%

Most occurring characters

Value | Count | Frequency (%)
(whitespace) | 114840 | 23.2%
o | 34310 | 6.9%
r | 31295 | 6.3%
e | 28693 | 5.8%
, | 26620 | 5.4%
a | 23617 | 4.8%
i | 22805 | 4.6%
t | 22753 | 4.6%
n | 21456 | 4.3%
s | 21294 | 4.3%
Other values (27) | 147776 | 29.8%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 495459 | 100.0%

Every character falls into the single (unknown) category, script, and block, so the most frequent characters per category, per script, and per block match the table of most occurring characters.
Distinct: 183
Distinct (%): 2.6%
Missing: 12592
Missing (%): 63.9%
Memory size: 154.2 KiB

Length

Max length: 189
Median length: 170
Mean length: 26.534316
Min length: 4

Characters and Unicode

Total characters: 189057
Distinct characters: 38
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 82
Unique (%): 1.2%

Sample

1st row: Microsoft Azure
2nd row: Amazon Web Services (AWS)
3rd row: Google Cloud Platform (GCP) , Amazon Web Services (AWS) , Microsoft Azure
4th row: None
5th row: Google Cloud Platform (GCP)
Value | Count | Frequency (%)
cloud | 3233 | 10.9%
web | 2758 | 9.3%
services | 2758 | 9.3%
aws | 2758 | 9.3%
amazon | 2758 | 9.3%
(blank) | 2621 | 8.9%
none | 2229 | 7.5%
google | 2134 | 7.2%
platform | 2134 | 7.2%
gcp | 2134 | 7.2%
Other values (11) | 4059 | 13.7%

Most occurring characters

Value | Count | Frequency (%)
(whitespace) | 34524 | 18.3%
o | 17449 | 9.2%
e | 14804 | 7.8%
r | 8222 | 4.3%
l | 7888 | 4.2%
A | 7072 | 3.7%
S | 5721 | 3.0%
a | 5638 | 3.0%
W | 5516 | 2.9%
C | 5367 | 2.8%
Other values (28) | 76856 | 40.7%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 189057 | 100.0%

Every character falls into the single (unknown) category, script, and block, so the most frequent characters per category, per script, and per block match the table of most occurring characters.
Distinct: 336
Distinct (%): 4.7%
Missing: 12617
Missing (%): 64.0%
Memory size: 154.2 KiB

Length

Max length: 231
Median length: 224
Mean length: 27.010986
Min length: 4

Characters and Unicode

Total characters: 191778
Distinct characters: 39
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 155
Unique (%): 2.2%

Sample

1st row: Azure Virtual Machines, Azure Container Service
2nd row: AWS Elastic Compute Cloud (EC2)
3rd row: Google Compute Engine (GCE), AWS Lambda, Azure Virtual Machines
4th row: None
5th row: AWS Elastic Compute Cloud (EC2)
Value | Count | Frequency (%)
aws | 3281 | 11.2%
none | 3155 | 10.7%
google | 2963 | 10.1%
compute | 2948 | 10.0%
cloud | 2512 | 8.5%
engine | 2261 | 7.7%
elastic | 2121 | 7.2%
ec2 | 1810 | 6.2%
azure | 1233 | 4.2%
gce | 1138 | 3.9%
Other values (11) | 6001 | 20.4%

Most occurring characters

Value | Count | Frequency (%)
(whitespace) | 22323 | 11.6%
e | 16711 | 8.7%
o | 15638 | 8.2%
n | 11546 | 6.0%
C | 8803 | 4.6%
u | 8759 | 4.6%
l | 8745 | 4.6%
t | 8364 | 4.4%
i | 7550 | 3.9%
E | 7330 | 3.8%
Other values (29) | 76009 | 39.6%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 191778 | 100.0%

Every character falls into the single (unknown) category, script, and block, so the most frequent characters per category, per script, and per block match the table of most occurring characters.
Distinct: 287
Distinct (%): 4.1%
Missing: 12639
Missing (%): 64.1%
Memory size: 154.2 KiB

Length

Max length: 173
Median length: 4
Mean length: 13.882311
Min length: 4

Characters and Unicode

Total characters: 98259
Distinct characters: 40
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 143
Unique (%): 2.0%

Sample

1st row: Databricks, Microsoft Analysis Services
2nd row: AWS Elastic MapReduce
3rd row: Google BigQuery, Databricks
4th row: None
5th row: Google Cloud Dataflow
Value | Count | Frequency (%)
none | 4133 | 27.6%
google | 1881 | 12.5%
aws | 1641 | 10.9%
bigquery | 958 | 6.4%
cloud | 923 | 6.2%
databricks | 604 | 4.0%
redshift | 562 | 3.7%
dataflow | 525 | 3.5%
elastic | 429 | 2.9%
mapreduce | 429 | 2.9%
Other values (8) | 2914 | 19.4%

Most occurring characters

Value | Count | Frequency (%)
e | 10482 | 10.7%
o | 10195 | 10.4%
(whitespace) | 7921 | 8.1%
n | 5209 | 5.3%
a | 4877 | 5.0%
i | 4393 | 4.5%
l | 4184 | 4.3%
N | 4133 | 4.2%
s | 3861 | 3.9%
t | 3503 | 3.6%
Other values (30) | 39501 | 40.2%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 98259 | 100.0%

Every character falls into the single (unknown) category, script, and block, so the most frequent characters per category, per script, and per block match the table of most occurring characters.
Distinct: 272
Distinct (%): 3.9%
Missing: 12667
Missing (%): 64.2%
Memory size: 154.2 KiB

Length

Max length: 219
Median length: 4
Mean length: 16.253617
Min length: 3

Characters and Unicode

Total characters: 114588
Distinct characters: 34
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 126
Unique (%): 1.8%

Sample

1st row: Azure Machine Learning Studio
2nd row: RapidMiner
3rd row: SAS, Azure Machine Learning Studio, Google Cloud Machine Learning Engine
4th row: None
5th row: Google Cloud Translation
Value | Count | Frequency (%)
none | 4313 | 25.3%
cloud | 2111 | 12.4%
google | 2111 | 12.4%
machine | 1167 | 6.8%
learning | 1167 | 6.8%
engine | 586 | 3.4%
azure | 581 | 3.4%
studio | 581 | 3.4%
amazon | 569 | 3.3%
sagemaker | 569 | 3.3%
Other values (9) | 3283 | 19.3%

Most occurring characters

Value | Count | Frequency (%)
e | 13685 | 11.9%
o | 13339 | 11.6%
n | 11222 | 9.8%
(whitespace) | 9988 | 8.7%
a | 6954 | 6.1%
l | 5355 | 4.7%
g | 5233 | 4.6%
i | 5090 | 4.4%
N | 4713 | 4.1%
u | 4491 | 3.9%
Other values (24) | 34518 | 30.1%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 114588 | 100.0%

Every character falls into the single (unknown) category, script, and block, so the most frequent characters per category, per script, and per block match the table of most occurring characters.
Distinct: 201
Distinct (%): 2.9%
Missing: 12702
Missing (%): 64.4%
Memory size: 154.2 KiB

Length

Max length: 153
Median length: 4
Mean length: 9.4597292
Min length: 4

Characters and Unicode

Total characters: 66360
Distinct characters: 39
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 100
Unique (%): 1.4%

Sample

1st row: None
2nd row: Auto-Keras
3rd row: Google AutoML , Tpot , Auto-Keras , Auto-Sklearn , Auto_ml
4th row: None
5th row: Google AutoML
Value | Count | Frequency (%)
none | 5175 | 47.2%
(blank) | 1266 | 11.6%
automl | 860 | 7.8%
auto-sklearn | 756 | 6.9%
google | 498 | 4.5%
auto-keras | 465 | 4.2%
auto_ml | 279 | 2.5%
h20 | 277 | 2.5%
driverless | 277 | 2.5%
ai | 277 | 2.5%
Other values (6) | 831 | 7.6%

Most occurring characters

Value | Count | Frequency (%)
(whitespace) | 10742 | 16.2%
o | 9182 | 13.8%
e | 7608 | 11.5%
n | 5931 | 8.9%
N | 5175 | 7.8%
t | 3201 | 4.8%
A | 2637 | 4.0%
u | 2360 | 3.6%
r | 2098 | 3.2%
a | 1945 | 2.9%
Other values (29) | 15481 | 23.3%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 66360 | 100.0%

Every character falls into the single (unknown) category, script, and block, so the most frequent characters per category, per script, and per block match the table of most occurring characters.
Distinct: 454
Distinct (%): 6.5%
Missing: 12723
Missing (%): 64.5%
Memory size: 154.2 KiB

Length

Max length: 168
Median length: 146
Mean length: 24.700743
Min length: 4

Characters and Unicode

Total characters: 172757
Distinct characters: 36
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 196
Unique (%): 2.8%

Sample

1st row: Azure SQL Database
2nd row: PostgresSQL, AWS Relational Database Service
3rd row: MySQL, PostgresSQL
4th row: MySQL
5th row: MySQL
Value | Count | Frequency (%)
mysql | 3122 | 13.2%
sql | 2857 | 12.1%
microsoft | 2399 | 10.2%
database | 2259 | 9.6%
postgressql | 2160 | 9.2%
server | 1852 | 7.9%
sqlite | 1527 | 6.5%
none | 1245 | 5.3%
oracle | 1192 | 5.1%
aws | 1003 | 4.3%
Other values (8) | 3956 | 16.8%

Most occurring characters

Value | Count | Frequency (%)
(whitespace) | 16578 | 9.6%
e | 15690 | 9.1%
S | 13109 | 7.6%
r | 10809 | 6.3%
o | 10784 | 6.2%
s | 10072 | 5.8%
Q | 9666 | 5.6%
L | 9666 | 5.6%
a | 9560 | 5.5%
t | 9220 | 5.3%
Other values (26) | 57603 | 33.3%

Most occurring categories, scripts, and blocks

Value | Count | Frequency (%)
(unknown) | 172757 | 100.0%

Every character falls into the single (unknown) category, script, and block, so the most frequent characters per category, per script, and per block match the table of most occurring characters.

Correlations

Variables (rows and columns of the matrix use these numbers):

1. Approximately how many individuals are responsible for data science workloads at your place of business?
2. Approximately how much money have you spent on machine learning and/or cloud computing products at your work in the past 5 years?
3. Does your current employer incorporate machine learning methods into their business?
4. For how many years have you used machine learning methods?
5. Have you ever used a TPU (tensor processing unit)?
6. How long have you been writing code to analyze data (at work or at school)?
7. Select the title most similar to your current role (or most recent title if retired)
8. What is the highest level of formal education that you have attained or plan to attain within the next 2 years?
9. What is the size of the company where you are employed?
10. What is your age (# years)?
11. What is your current yearly compensation (approximate $USD)?
12. What is your gender?
13. What programming language would you recommend an aspiring data scientist to learn first?
14. Which categories of computer vision methods do you use on a regular basis?
15. Which of the following natural language processing (NLP) methods do you use on a regular basis?
16. Which types of specialized hardware do you use on a regular basis?

Variable | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16
1 | 1.000 | 0.155 | 0.244 | 0.112 | 0.034 | 0.118 | 0.107 | 0.057 | 0.302 | 0.038 | 0.108 | 0.014 | 0.025 | 0.000 | 0.055 | 0.036
2 | 0.155 | 1.000 | 0.173 | 0.158 | 0.073 | 0.153 | 0.091 | 0.046 | 0.103 | 0.096 | 0.201 | 0.036 | 0.037 | 0.052 | 0.051 | 0.089
3 | 0.244 | 0.173 | 1.000 | 0.201 | 0.052 | 0.162 | 0.161 | 0.066 | 0.120 | 0.059 | 0.141 | 0.030 | 0.041 | 0.059 | 0.090 | 0.099
4 | 0.112 | 0.158 | 0.201 | 1.000 | 0.089 | 0.475 | 0.180 | 0.161 | 0.035 | 0.185 | 0.165 | 0.046 | 0.048 | 0.089 | 0.096 | 0.109
5 | 0.034 | 0.073 | 0.052 | 0.089 | 1.000 | 0.053 | 0.039 | 0.021 | 0.024 | 0.034 | 0.055 | 0.041 | 0.036 | 0.114 | 0.109 | 0.278
6 | 0.118 | 0.153 | 0.162 | 0.475 | 0.053 | 1.000 | 0.188 | 0.152 | 0.058 | 0.271 | 0.205 | 0.048 | 0.066 | 0.069 | 0.078 | 0.077
7 | 0.107 | 0.091 | 0.161 | 0.180 | 0.039 | 0.188 | 1.000 | 0.182 | 0.050 | 0.194 | 0.069 | 0.056 | 0.075 | 0.044 | 0.062 | 0.070
8 | 0.057 | 0.046 | 0.066 | 0.161 | 0.021 | 0.152 | 0.182 | 1.000 | 0.067 | 0.186 | 0.086 | 0.080 | 0.051 | 0.065 | 0.000 | 0.036
9 | 0.302 | 0.103 | 0.120 | 0.035 | 0.024 | 0.058 | 0.050 | 0.067 | 1.000 | 0.083 | 0.139 | 0.013 | 0.039 | 0.018 | 0.043 | 0.042
10 | 0.038 | 0.096 | 0.059 | 0.185 | 0.034 | 0.271 | 0.194 | 0.186 | 0.083 | 1.000 | 0.149 | 0.062 | 0.061 | 0.040 | 0.030 | 0.029
11 | 0.108 | 0.201 | 0.141 | 0.165 | 0.055 | 0.205 | 0.069 | 0.086 | 0.139 | 0.149 | 1.000 | 0.059 | 0.040 | 0.055 | 0.050 | 0.045
12 | 0.014 | 0.036 | 0.030 | 0.046 | 0.041 | 0.048 | 0.056 | 0.080 | 0.013 | 0.062 | 0.059 | 1.000 | 0.053 | 0.088 | 0.041 | 0.110
13 | 0.025 | 0.037 | 0.041 | 0.048 | 0.036 | 0.066 | 0.075 | 0.051 | 0.039 | 0.061 | 0.040 | 0.053 | 1.000 | 0.060 | 0.078 | 0.063
14 | 0.000 | 0.052 | 0.059 | 0.089 | 0.114 | 0.069 | 0.044 | 0.065 | 0.018 | 0.040 | 0.055 | 0.088 | 0.060 | 1.000 | 0.139 | 0.174
15 | 0.055 | 0.051 | 0.090 | 0.096 | 0.109 | 0.078 | 0.062 | 0.000 | 0.043 | 0.030 | 0.050 | 0.041 | 0.078 | 0.139 | 1.000 | 0.099
16 | 0.036 | 0.089 | 0.099 | 0.109 | 0.278 | 0.077 | 0.070 | 0.036 | 0.042 | 0.029 | 0.045 | 0.110 | 0.063 | 0.174 | 0.099 | 1.000
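The report does not state which association measure produced the matrix above. For pairs of categorical variables, Cramér's V is one common choice, so the sketch below assumes it and computes the plain (not bias-corrected) version from a contingency table built with the standard library. `cramers_v` and the toy columns are hypothetical, and the values it produces are not expected to reproduce the figures in the matrix.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two categorical columns, skipping missing pairs."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    row_totals = Counter(x for x, _ in pairs)
    col_totals = Counter(y for _, y in pairs)
    observed = Counter(pairs)
    k = min(len(row_totals), len(col_totals)) - 1
    if n == 0 or k <= 0:
        return 0.0  # association is undefined; report 0 by convention here
    chi2 = 0.0
    for x, rx in row_totals.items():
        for y, cy in col_totals.items():
            expected = rx * cy / n
            chi2 += (observed[(x, y)] - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))


company_size = ["small", "small", "large", "large"]
team_size = ["1-2", "1-2", "20+", "20+"]      # moves in lockstep with company_size
language = ["Python", "R", "Python", "R"]     # unrelated to company_size

print(cramers_v(company_size, team_size))
print(cramers_v(company_size, language))
```

The coefficient ranges from 0 (no association) to 1 (one variable fully determines the other), which is consistent with the 0 to 1 range of the off-diagonal entries above.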

Missing values

2024-11-05T21:36:29.349302image/svg+xmlMatplotlib v3.7.0, https://matplotlib.org/
A simple visualization of nullity by column.
2024-11-05T21:36:31.067486image/svg+xmlMatplotlib v3.7.0, https://matplotlib.org/
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
2024-11-05T21:36:33.678298image/svg+xmlMatplotlib v3.7.0, https://matplotlib.org/
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
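One common way to compute a nullity correlation is to take the Pearson correlation of 0/1 missingness indicators. The sketch below assumes that definition and assumes that columns whose missingness never varies are dropped; the function name `nullity_correlation` and the toy data are hypothetical.

```python
import pandas as pd


def nullity_correlation(df: pd.DataFrame) -> pd.DataFrame:
    """Pearson correlation between per-column missingness indicators."""
    indicators = df.isna().astype(int)  # 1 where the cell is missing
    # Columns that are always present (or always missing) have zero variance,
    # so their correlation is undefined; leave them out.
    varying = indicators.loc[:, indicators.nunique() > 1]
    return varying.corr()


# Hypothetical toy data: employment questions are skipped together,
# while the ML question is answered exactly when they are skipped.
toy = pd.DataFrame(
    {
        "company_size": ["0-49", None, None, "> 10,000", None, "50-249"],
        "compensation": ["$0-999", None, None, "60,000-69,999", None, "10,000-14,999"],
        "ml_years": [None, "1-2 years", "< 1 years", None, "2-3 years", None],
        "age": ["22-24", "40-44", "55-59", "18-21", "25-29", "30-34"],
    }
)
corr = nullity_correlation(toy)
print(corr)
```

A value near 1 means two variables tend to be missing together, a value near -1 means one tends to be present when the other is absent, and a value near 0 means the two missingness patterns are unrelated.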

Sample

(index) | What is your age (# years)? | What is your gender? | In which country do you currently reside? | What is the highest level of formal education that you have attained or plan to attain within the next 2 years? | Select the title most similar to your current role (or most recent title if retired) | What is the size of the company where you are employed? | Approximately how many individuals are responsible for data science workloads at your place of business? | Does your current employer incorporate machine learning methods into their business? | What is your current yearly compensation (approximate $USD)? | Approximately how much money have you spent on machine learning and/or cloud computing products at your work in the past 5 years? | What is the primary tool that you use at work or school to analyze data? | How long have you been writing code to analyze data (at work or at school)? | What programming language would you recommend an aspiring data scientist to learn first? | Have you ever used a TPU (tensor processing unit)? | For how many years have you used machine learning methods? | Select any activities that make up an important part of your role at work: | Who/what are your favorite media sources that report on data science topics? | On which platforms have you begun or completed data science courses? | Which of the following integrated development environments (IDE's) do you use on a regular basis? | Which of the following hosted notebook products do you use on a regular basis? | What programming languages do you use on a regular basis? | What data visualization libraries or tools do you use on a regular basis? | Which types of specialized hardware do you use on a regular basis? | Which of the following ML algorithms do you use on a regular basis? | Which categories of ML tools do you use on a regular basis? | Which categories of computer vision methods do you use on a regular basis? | Which of the following natural language processing (NLP) methods do you use on a regular basis? | Which of the following machine learning frameworks do you use on a regular basis? | Which of the following cloud computing platforms do you use on a regular basis? | Which specific cloud computing products do you use on a regular basis? | Which specific big data / analytics products do you use on a regular basis? | Which of the following machine learning products do you use on a regular basis? | Which automated machine learning tools (or partial AutoML tools) do you use on a regular basis? | Which of the following relational database products do you use on a regular basis?
0 | 22-24 | Male | France | Master’s degree | Software Engineer | 1000-9,999 employees | 0 | I do not know | 30,000-39,999 | $0 (USD) | Basic statistical software (Microsoft Excel, Google Sheets, etc.), 0, -1, -1, -1, -1 | 1-2 years | Python | Never | 1-2 years | NaN | Twitter (data science influencers), Kaggle (forums, blog, social media, etc), Blogs (Towards Data Science, Medium, Analytics Vidhya, KDnuggets etc), Journal Publications (traditional publications, preprint journals, etc) | Coursera, DataCamp, Kaggle Courses (i.e. Kaggle Learn), Udemy | Jupyter (JupyterLab, Jupyter Notebooks, etc) , RStudio , PyCharm , MATLAB , Spyder | None | Python, R, SQL, Java, Javascript, MATLAB | Matplotlib | CPUs, GPUs | Linear or Logistic Regression | None | NaN | NaN | None | NaN | NaN | NaN | NaN | NaN | NaN
1 | 40-44 | Male | India | Professional degree | Software Engineer | > 10,000 employees | 20+ | We have well established ML methods (i.e., models in production for more than 2 years) | 5,000-7,499 | > $100,000 ($USD) | Cloud-based data software & APIs (AWS, GCP, Azure, etc.), -1, -1, -1, -1, 0 | I have never written code | NaN | NaN | NaN | Analyze and understand data to influence product or business decisions, Build and/or run the data infrastructure that my business uses for storing, analyzing, and operationalizing data, Build prototypes to explore applying machine learning to new areas, Build and/or run a machine learning service that operationally improves my product or workflows | Kaggle (forums, blog, social media, etc), YouTube (Cloud AI Adventures, Siraj Raval, etc), Podcasts (Chai Time Data Science, Linear Digressions, etc), Blogs (Towards Data Science, Medium, Analytics Vidhya, KDnuggets etc) | Coursera, DataCamp, Kaggle Courses (i.e. Kaggle Learn), Udemy | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
2 | 55-59 | Female | Germany | Professional degree | NaN | NaN | NaN | NaN | NaN | NaN | -1, -1, -1, -1, -1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
3 | 40-44 | Male | Australia | Master’s degree | Other | > 10,000 employees | 20+ | I do not know | 250,000-299,999 | $10,000-$99,999 | Local development environments (RStudio, JupyterLab, etc.), -1, -1, -1, 0, -1 | 1-2 years | Python | Once | 2-3 years | NaN | Podcasts (Chai Time Data Science, Linear Digressions, etc), Blogs (Towards Data Science, Medium, Analytics Vidhya, KDnuggets etc), Journal Publications (traditional publications, preprint journals, etc), Slack Communities (ods.ai, kagglenoobs, etc) | Coursera, edX, DataCamp, University Courses (resulting in a university degree) | Jupyter (JupyterLab, Jupyter Notebooks, etc) , Visual Studio / Visual Studio Code | Microsoft Azure Notebooks | Python, R, SQL, Bash | Ggplot / ggplot2 , Matplotlib , Seaborn | CPUs, GPUs | Linear or Logistic Regression, Convolutional Neural Networks | Automation of full ML pipelines (e.g. Google AutoML, H20 Driverless AI) | General purpose image/video tools (PIL, cv2, skimage, etc), Image classification and other general purpose networks (VGG, Inception, ResNet, ResNeXt, NASNet, EfficientNet, etc) | NaN | Scikit-learn , TensorFlow , Keras , RandomForest | Microsoft Azure | Azure Virtual Machines, Azure Container Service | Databricks, Microsoft Analysis Services | Azure Machine Learning Studio | None | Azure SQL Database
4 | 22-24 | Male | India | Bachelor’s degree | Other | 0-49 employees | 0 | No (we do not use ML methods) | 4,000-4,999 | $0 (USD) | Local development environments (RStudio, JupyterLab, etc.), -1, -1, -1, 1, -1 | < 1 years | Python | Never | < 1 years | NaN | YouTube (Cloud AI Adventures, Siraj Raval, etc), Blogs (Towards Data Science, Medium, Analytics Vidhya, KDnuggets etc), Other | Other | Jupyter (JupyterLab, Jupyter Notebooks, etc) | Google Colab , Google Cloud Notebook Products (AI Platform, Datalab, etc) | Python, SQL | Matplotlib , Plotly / Plotly Express , Seaborn | CPUs, GPUs | Linear or Logistic Regression, Decision Trees or Random Forests, Gradient Boosting Machines (xgboost, lightgbm, etc) | None | NaN | NaN | Scikit-learn , RandomForest, Xgboost , LightGBM | NaN | NaN | NaN | NaN | NaN | NaN
5 | 50-54 | Male | France | Master’s degree | Data Scientist | 0-49 employees | 3-4 | We have well established ML methods (i.e., models in production for more than 2 years) | 60,000-69,999 | $10,000-$99,999 | Advanced statistical software (SPSS, SAS, etc.), -1, 0, -1, -1, -1 | 20+ years | Java | Never | 10-15 years | Build prototypes to explore applying machine learning to new areas, Do research that advances the state of the art of machine learning | YouTube (Cloud AI Adventures, Siraj Raval, etc), Blogs (Towards Data Science, Medium, Analytics Vidhya, KDnuggets etc), Journal Publications (traditional publications, preprint journals, etc) | None | RStudio , Other | None | Python, R | Ggplot / ggplot2 | CPUs, GPUs | Linear or Logistic Regression, Decision Trees or Random Forests, Gradient Boosting Machines (xgboost, lightgbm, etc), Bayesian Approaches, Convolutional Neural Networks, Generative Adversarial Networks, Recurrent Neural Networks | Automated model selection (e.g. auto-sklearn, xcessiv), Automated hyperparameter tuning (e.g. hyperopt, ray.tune), Automation of full ML pipelines (e.g. Google AutoML, H20 Driverless AI) | None | Word embeddings/vectors (GLoVe, fastText, word2vec), Encoder-decorder models (seq2seq, vanilla transformers) | Scikit-learn , TensorFlow , Keras , RandomForest, Xgboost , Caret | Amazon Web Services (AWS) | AWS Elastic Compute Cloud (EC2) | AWS Elastic MapReduce | RapidMiner | Auto-Keras | PostgresSQL, AWS Relational Database Service
6 | 22-24 | Male | India | Master’s degree | Data Scientist | 50-249 employees | 20+ | We are exploring ML methods (and may one day put a model into production) | 10,000-14,999 | $100-$999 | Local development environments (RStudio, JupyterLab, etc.), -1, -1, -1, 2, -1 | 3-5 years | Python | 6-24 times | 2-3 years | Analyze and understand data to influence product or business decisions, Experimentation and iteration to improve existing ML models, Do research that advances the state of the art of machine learning | Kaggle (forums, blog, social media, etc), Course Forums (forums.fast.ai, etc), YouTube (Cloud AI Adventures, Siraj Raval, etc), Podcasts (Chai Time Data Science, Linear Digressions, etc), Journal Publications (traditional publications, preprint journals, etc) | Udacity, Coursera, edX, Kaggle Courses (i.e. Kaggle Learn), Udemy | Jupyter (JupyterLab, Jupyter Notebooks, etc) , Spyder , Notepad++ , Sublime Text | Kaggle Notebooks (Kernels) , Google Colab , Binder / JupyterHub | Python, R, Bash | Matplotlib , Plotly / Plotly Express , Bokeh , Seaborn | CPUs, GPUs | Linear or Logistic Regression, Dense Neural Networks (MLPs, etc), Convolutional Neural Networks, Recurrent Neural Networks | Automated data augmentation (e.g. imgaug, albumentations), Automated feature engineering/selection (e.g. tpot, boruta_py), Automated model selection (e.g. auto-sklearn, xcessiv), Automation of full ML pipelines (e.g. Google AutoML, H20 Driverless AI) | General purpose image/video tools (PIL, cv2, skimage, etc), Image segmentation methods (U-Net, Mask R-CNN, etc), Object detection methods (YOLOv3, RetinaNet, etc) | Word embeddings/vectors (GLoVe, fastText, word2vec), Encoder-decorder models (seq2seq, vanilla transformers) | Scikit-learn , TensorFlow , Keras , PyTorch | Google Cloud Platform (GCP) , Amazon Web Services (AWS) , Microsoft Azure | Google Compute Engine (GCE), AWS Lambda, Azure Virtual Machines | Google BigQuery, Databricks | SAS, Azure Machine Learning Studio, Google Cloud Machine Learning Engine | Google AutoML , Tpot , Auto-Keras , Auto-Sklearn , Auto_ml | MySQL, PostgresSQL
7 | 22-24 | Female | United States of America | Bachelor’s degree | Data Scientist | > 10,000 employees | 20+ | We recently started using ML methods (i.e., models in production for less than 2 years) | 80,000-89,999 | $0 (USD) | Local development environments (RStudio, JupyterLab, etc.), -1, -1, -1, 3, -1 | 3-5 years | Python | Once | 3-4 years | Analyze and understand data to influence product or business decisions, Build prototypes to explore applying machine learning to new areas, Build and/or run a machine learning service that operationally improves my product or workflows | Hacker News (https://news.ycombinator.com/), Blogs (Towards Data Science, Medium, Analytics Vidhya, KDnuggets etc), Journal Publications (traditional publications, preprint journals, etc) | Udemy, University Courses (resulting in a university degree) | Jupyter (JupyterLab, Jupyter Notebooks, etc) , Spyder | Microsoft Azure Notebooks , AWS Notebook Products (EMR Notebooks, Sagemaker Notebooks, etc) | Python | Matplotlib , Plotly / Plotly Express | CPUs | Linear or Logistic Regression, Decision Trees or Random Forests, Convolutional Neural Networks | None | General purpose image/video tools (PIL, cv2, skimage, etc), Image classification and other general purpose networks (VGG, Inception, ResNet, ResNeXt, NASNet, EfficientNet, etc) | NaN | Scikit-learn , TensorFlow , Keras , Spark MLib | NaN | NaN | NaN | NaN | NaN | NaN
8 | 22-24 | Male | United States of America | Bachelor’s degree | Student | NaN | NaN | NaN | NaN | NaN | Local development environments (RStudio, JupyterLab, etc.), -1, -1, -1, 4, -1 | 3-5 years | Python | Never | 1-2 years | NaN | Kaggle (forums, blog, social media, etc), Blogs (Towards Data Science, Medium, Analytics Vidhya, KDnuggets etc), Journal Publications (traditional publications, preprint journals, etc) | Kaggle Courses (i.e. Kaggle Learn), University Courses (resulting in a university degree) | Jupyter (JupyterLab, Jupyter Notebooks, etc) , PyCharm , Atom | Google Colab | Python | Matplotlib , Seaborn | CPUs, GPUs | Linear or Logistic Regression, Decision Trees or Random Forests, Gradient Boosting Machines (xgboost, lightgbm, etc), Bayesian Approaches, Evolutionary Approaches, Dense Neural Networks (MLPs, etc), Convolutional Neural Networks | None | General purpose image/video tools (PIL, cv2, skimage, etc), Image classification and other general purpose networks (VGG, Inception, ResNet, ResNeXt, NASNet, EfficientNet, etc) | NaN | Scikit-learn , Xgboost , PyTorch , LightGBM | NaN | NaN | NaN | NaN | NaN | NaN
9 | 55-59 | Male | Netherlands | Master’s degree | Other | 0-49 employees | 1-2 | We are exploring ML methods (and may one day put a model into production) | $0-999 | $100-$999 | Local development environments (RStudio, JupyterLab, etc.), -1, -1, -1, 5, -1 | 5-10 years | Python | Never | < 1 years | Other | Kaggle (forums, blog, social media, etc), Course Forums (forums.fast.ai, etc), Blogs (Towards Data Science, Medium, Analytics Vidhya, KDnuggets etc) | Coursera | Jupyter (JupyterLab, Jupyter Notebooks, etc) | None | Python, SQL | Matplotlib , D3.js , Seaborn | CPUs | Linear or Logistic Regression, Bayesian Approaches, Generative Adversarial Networks | None | None | NaN | Scikit-learn , PyTorch | None | None | None | None | None | MySQL
(index) | What is your age (# years)? | What is your gender? | In which country do you currently reside? | What is the highest level of formal education that you have attained or plan to attain within the next 2 years? | Select the title most similar to your current role (or most recent title if retired) | What is the size of the company where you are employed? | Approximately how many individuals are responsible for data science workloads at your place of business? | Does your current employer incorporate machine learning methods into their business? | What is your current yearly compensation (approximate $USD)? | Approximately how much money have you spent on machine learning and/or cloud computing products at your work in the past 5 years? | What is the primary tool that you use at work or school to analyze data? | How long have you been writing code to analyze data (at work or at school)? | What programming language would you recommend an aspiring data scientist to learn first? | Have you ever used a TPU (tensor processing unit)? | For how many years have you used machine learning methods? | Select any activities that make up an important part of your role at work: | Who/what are your favorite media sources that report on data science topics? | On which platforms have you begun or completed data science courses? | Which of the following integrated development environments (IDE's) do you use on a regular basis? | Which of the following hosted notebook products do you use on a regular basis? | What programming languages do you use on a regular basis? | What data visualization libraries or tools do you use on a regular basis? | Which types of specialized hardware do you use on a regular basis? | Which of the following ML algorithms do you use on a regular basis? | Which categories of ML tools do you use on a regular basis? | Which categories of computer vision methods do you use on a regular basis? | Which of the following natural language processing (NLP) methods do you use on a regular basis? | Which of the following machine learning frameworks do you use on a regular basis? | Which of the following cloud computing platforms do you use on a regular basis? | Which specific cloud computing products do you use on a regular basis? | Which specific big data / analytics products do you use on a regular basis? | Which of the following machine learning products do you use on a regular basis? | Which automated machine learning tools (or partial AutoML tools) do you use on a regular basis? | Which of the following relational database products do you use on a regular basis?
19707 | 18-21 | Male | Viet Nam | NaN | NaN | NaN | NaN | NaN | NaN | NaN | -1, -1, -1, -1, -1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
19708 | 25-29 | Female | India | Professional degree | Not employed | NaN | NaN | NaN | NaN | NaN | -1, -1, -1, -1, -1 | NaN | NaN | NaN | NaN | NaN | Blogs (Towards Data Science, Medium, Analytics Vidhya, KDnuggets etc) | Coursera, DataCamp, Kaggle Courses (i.e. Kaggle Learn), LinkedIn Learning, University Courses (resulting in a university degree) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
19709 | 25-29 | Prefer not to say | Austria | No formal education past high school | Data Scientist | 250-999 employees | 1-2 | We use ML methods for generating insights (but do not put working models into production) | 1,000-1,999 | NaN | -1, -1, -1, -1, -1 | NaN | NaN | NaN | NaN | Analyze and understand data to influence product or business decisions | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
19710 | 22-24 | Male | India | Bachelor’s degree | Data Scientist | 50-249 employees | NaN | NaN | NaN | NaN | -1, -1, -1, -1, -1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
19711 | 18-21 | Male | India | Master’s degree | Student | NaN | NaN | NaN | NaN | NaN | -1, -1, -1, -1, -1 | NaN | NaN | NaN | NaN | NaN | Kaggle (forums, blog, social media, etc), Blogs (Towards Data Science, Medium, Analytics Vidhya, KDnuggets etc), Journal Publications (traditional publications, preprint journals, etc) | Coursera | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
19712 | 50-54 | Male | Japan | NaN | NaN | NaN | NaN | NaN | NaN | NaN | -1, -1, -1, -1, -1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
19713 | 18-21 | Male | India | Bachelor’s degree | Other | 250-999 employees | 3-4 | I do not know | $0-999 | $0 (USD) | Local development environments (RStudio, JupyterLab, etc.), -1, -1, -1, 28, -1 | 1-2 years | NaN | NaN | NaN | NaN | Reddit (r/machinelearning, r/datascience, etc) | DataCamp, Udemy | Jupyter (JupyterLab, Jupyter Notebooks, etc) , RStudio , PyCharm , Visual Studio / Visual Studio Code , Spyder , Notepad++ , Sublime Text | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
19714 | 35-39 | Male | India | Master’s degree | Student | NaN | NaN | NaN | NaN | NaN | -1, -1, -1, -1, -1 | NaN | NaN | NaN | NaN | NaN | Kaggle (forums, blog, social media, etc), Course Forums (forums.fast.ai, etc), YouTube (Cloud AI Adventures, Siraj Raval, etc), Blogs (Towards Data Science, Medium, Analytics Vidhya, KDnuggets etc), Journal Publications (traditional publications, preprint journals, etc) | Coursera, Kaggle Courses (i.e. Kaggle Learn) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
19715 | 25-29 | Male | India | Master’s degree | Statistician | 50-249 employees | 15-19 | We recently started using ML methods (i.e., models in production for less than 2 years) | 1,000-1,999 | NaN | -1, -1, -1, -1, -1 | NaN | NaN | NaN | NaN | Other | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
19716 | 50-54 | Male | France | Bachelor’s degree | Software Engineer | > 10,000 employees | 20+ | We have well established ML methods (i.e., models in production for more than 2 years) | 60,000-69,999 | $0 (USD) | Local development environments (RStudio, JupyterLab, etc.), -1, -1, -1, 25, -1 | 3-5 years | Python | Never | 4-5 years | Build and/or run the data infrastructure that my business uses for storing, analyzing, and operationalizing data, Build prototypes to explore applying machine learning to new areas | Blogs (Towards Data Science, Medium, Analytics Vidhya, KDnuggets etc) | Coursera, edX, Udemy | Jupyter (JupyterLab, Jupyter Notebooks, etc) , Visual Studio / Visual Studio Code | IBM Watson Studio | Python, SQL, Java, Bash | Matplotlib | CPUs | Linear or Logistic Regression, Decision Trees or Random Forests | Automated model selection (e.g. auto-sklearn, xcessiv), Automated hyperparameter tuning (e.g. hyperopt, ray.tune) | NaN | NaN | Scikit-learn , Spark MLib | NaN | NaN | NaN | NaN | NaN | NaN

Duplicate rows

Most frequently occurring

(index) | What is your age (# years)? | What is your gender? | In which country do you currently reside? | What is the highest level of formal education that you have attained or plan to attain within the next 2 years? | Select the title most similar to your current role (or most recent title if retired) | What is the size of the company where you are employed? | Approximately how many individuals are responsible for data science workloads at your place of business? | Does your current employer incorporate machine learning methods into their business? | What is your current yearly compensation (approximate $USD)? | Approximately how much money have you spent on machine learning and/or cloud computing products at your work in the past 5 years? | What is the primary tool that you use at work or school to analyze data? | How long have you been writing code to analyze data (at work or at school)? | What programming language would you recommend an aspiring data scientist to learn first? | Have you ever used a TPU (tensor processing unit)? | For how many years have you used machine learning methods? | Select any activities that make up an important part of your role at work: | Who/what are your favorite media sources that report on data science topics? | On which platforms have you begun or completed data science courses? | Which of the following integrated development environments (IDE's) do you use on a regular basis? | Which of the following hosted notebook products do you use on a regular basis? | What programming languages do you use on a regular basis? | What data visualization libraries or tools do you use on a regular basis? | Which types of specialized hardware do you use on a regular basis? | Which of the following ML algorithms do you use on a regular basis? | Which categories of ML tools do you use on a regular basis? | Which categories of computer vision methods do you use on a regular basis? | Which of the following natural language processing (NLP) methods do you use on a regular basis? | Which of the following machine learning frameworks do you use on a regular basis? | Which of the following cloud computing platforms do you use on a regular basis? | Which specific cloud computing products do you use on a regular basis? | Which specific big data / analytics products do you use on a regular basis? | Which of the following machine learning products do you use on a regular basis? | Which automated machine learning tools (or partial AutoML tools) do you use on a regular basis? | Which of the following relational database products do you use on a regular basis? | # duplicates
27 | 18-21 | Male | India | NaN | NaN | NaN | NaN | NaN | NaN | NaN | -1, -1, -1, -1, -1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 30
74 | 22-24 | Male | India | NaN | NaN | NaN | NaN | NaN | NaN | NaN | -1, -1, -1, -1, -1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 17
20 | 18-21 | Male | India | Bachelor’s degree | Student | NaN | NaN | NaN | NaN | NaN | -1, -1, -1, -1, -1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 12
55 | 22-24 | Male | China | NaN | NaN | NaN | NaN | NaN | NaN | NaN | -1, -1, -1, -1, -1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 12
118 | 25-29 | Male | United States of America | NaN | NaN | NaN | NaN | NaN | NaN | NaN | -1, -1, -1, -1, -1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 10
132 | 30-34 | Male | Japan | NaN | NaN | NaN | NaN | NaN | NaN | NaN | -1, -1, -1, -1, -1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 10
21 | 18-21 | Male | India | Bachelor’s degree | NaN | NaN | NaN | NaN | NaN | NaN | -1, -1, -1, -1, -1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 9
67 | 22-24 | Male | India | Bachelor’s degree | NaN | NaN | NaN | NaN | NaN | NaN | -1, -1, -1, -1, -1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 8
53 | 22-24 | Male | China | Master’s degree | Student | NaN | NaN | NaN | NaN | NaN | -1, -1, -1, -1, -1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 7
99 | 25-29 | Male | China | NaN | NaN | NaN | NaN | NaN | NaN | NaN | -1, -1, -1, -1, -1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 7
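
A listing of this kind can be approximated by grouping on every column and counting group sizes. The sketch below is not the report's own code; it assumes pandas 1.1 or later (for `dropna=False` in `groupby`), and the helper name `most_frequent_duplicates` and the toy rows are hypothetical. Whether the `# duplicates` column above counts every occurrence or only the extra copies is not stated here; the sketch counts every occurrence.

```python
import pandas as pd


def most_frequent_duplicates(df: pd.DataFrame, top: int = 10) -> pd.DataFrame:
    """Group identical rows (NaN treated as equal) and count occurrences."""
    counts = (
        df.groupby(list(df.columns), dropna=False)  # keep groups whose keys are NaN
        .size()
        .reset_index(name="# duplicates")
    )
    repeated = counts[counts["# duplicates"] > 1]
    return repeated.sort_values("# duplicates", ascending=False).head(top)


# Hypothetical toy rows: respondents who answered only the first questions
# become indistinguishable once every later answer is NaN.
toy = pd.DataFrame(
    {
        "age": ["18-21", "18-21", "18-21", "22-24", "22-24", "25-29"],
        "gender": ["Male", "Male", "Male", "Male", "Male", "Female"],
        "country": ["India", "India", "India", "India", "India", "France"],
        "education": [None, None, None, None, None, "Master's degree"],
    }
)
dupes = most_frequent_duplicates(toy)
print(dupes)
```

The pattern in the toy data matches what the table suggests: the most frequent duplicates are sparse rows where only age, gender, and country were answered.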